J. P. GUILFORD:
PSYCHOLOGIST AND TEACHER 
ANDREW L. COMREY
AND BENJAMIN FRUCHTER
A review of the life and contributions of one of the foremost psycholo-
gists of our time. The review covers the contributions of Guilford to
experimental psychology—experimental esthetics, sensory processes, 
perception, and learning; statistical psychology—psychophysics, analy- 
sis, test theory and evaluation; measurement of mental abilities—apti- 
tudes of high level personnel, structure of the intellect, and creativity; 
and personality—tests, questionnaires, and inventories by Guilford and 
his associates. A complete bibliography shows that Guilford published 
either as sole author or as co-author 21 books; 29 monographs; 158 
articles; and 21 tests, manuals, and profile sheets. A portrait of Guilford
is included.


When J. P. Guilford retired from 
his formal full-time duties at the 
University of Southern California in 
June 1962, he had already devoted 
more than 35 years of distinguished 
service to the profession of psychol- 
ogy. As a dedicated scientist, a 
prolific writer, and a great teacher, 
Guilford’s influence upon the develop-
ment of a science of psychology has
been of the first magnitude. His 
numerous research contributions 
have been not only diversified in 
scope, but also penetrating in con-
cept. No less important in winning
for Guilford a following of loyal stu- 
dents and colleagues in every branch 
of psychology have been his countless 
pleasing personal qualities. His fair- 
ness, kindness, patience, and helpful- 
ness are known to all who have stud- 
ied or worked with him. Generous 
with his time, he has stimulated the
development of insightful thinking by 
others and has encouraged scores of 
graduate students to become psy- 
chologists—students who never have 
had cause for feeling threatened in 
seeking his counsel. 

Born March 7, 1897, on a farm 
near Marquette, Nebraska, Guilford 
was graduated in 1914 from high 
school in Aurora, Nebraska. It was
in his last year of high school that he 
was first exposed to the rudiments of 
psychology in a course aimed at the 
training of teachers. After working 
for a year on the farm, he taught dur- 
ing the 1915-16 academic year all 
eight grades in a county school just 
outside Phillips, Nebraska. Through- 
out the following year he taught 
Grades 5 through 8 at the village 
school in Phillips. Between Septem- 
ber 1917 and June 1918 he completed 
his first year of college at the Uni-
versity of Nebraska, and in Septem-
ber 1918 entered the United States
Army as a private. With the First 
World War soon over, Guilford re- 
turned in the Spring of 1919 to teach 
Grades 7 and 8 in Hooper, Nebraska, 
only to become Acting County Super- 
intendent of Schools in his home 
county during the summer. 

Returning to the University of 
Nebraska in 1919 he started to major 
in chemistry, but found psychology 
more to his liking. Largely because 
of the influence of one great teacher 
who recognized his talents and took a 
keen interest in seeing his academic 
career furthered, Guilford was won 
over to psychology for good. This 
inspiring and perceptive teacher was 
Winifred Hyde. How often, as many 
a reader has probably noticed, the 
influence of just one or two enthusi- 
astic teachers has served to change 
the entire course of one’s school-
ing and subsequent professional
life!

So much confidence did Winifred 
Hyde have in her protégé that she
arranged for an assistantship to 
carry him through his junior and 
senior years—an honor customarily 
reserved for graduate students. After 
completing his AB degree in 1922, 
Guilford continued at the University 
of Nebraska until 1924 to earn his 
master's degree. 

It was decided that he should go 
on for his doctorate at Cornell Uni- 
versity. In the hands of such famous 
teachers as Titchener, Koffka, Dallen-
bach, and Helson, Guilford gained
the broad base in experimental and 
theoretical psychology that prepared 
him to undertake serious study of 
many problem areas within psychol- 
ogy. During his years of graduate 
study he was active in research as 
evidenced by the fact that he was au- 
thor, or co-author, of five articles. 
His doctorate was conferred in 1927. 


It was during the summer of 1926 
that Guilford met Ruth Burke, a 
graduate student in psychology, 
whom he married on September 8, 
1927. Almost 13 months later on 
September 28, 1928, their daughter 
was born. The month of September 
must have a magical quality for the 
Guilford family, for their daughter,
Joan McClung, completed her doc- 
torate in psychology on September 
21, 1962. 

Guilford climbed the academic 
ladder swiftly. After serving as an 
instructor for the year 1926-27 at the 
University of Illinois, he spent the 
following academic year as an assist- 
ant professor at the University of 
Kansas. In the fall of 1928 he as- 
sumed the position of associate pro- 
fessor at his alma mater, the Uni- 
versity of Nebraska, where he was 
promoted to full professor in 1932. 
During the period 1938-40 he also 
served as Director of the Bureau of 
Instructional Research. 

After learning of the delights of 
California living while teaching dur- 
ing the summer sessions of 1938 and 
1939 at the University of Southern 
California, Guilford accepted a full- 
time position as Professor of Psychol- 
ogy at the same university beginning 
in the fall of 1940—a position which 
he retained until his formal retire- 
ment last June. So highly had the 
University of Nebraska respected 
Guilford's achievement that it awarded
him an honorary degree of Doctor of 
Laws in 1952. At the June 1962 
Commencement the University of 
Southern California, in recognition of 
his distinguished service, conferred 
upon him the honorary degree of 
Doctor of Science. At the earnest re-
quest of the University, Guilford has
consented to serve part time in teach- 
ing and to continue participation in 
research contracts concerned with his 
work upon creativity and high level
aptitudes. 

During the years of World War II 
Guilford took leave of his academic 
duties from March 1942 to January 
1946, and through participation in 
the psychological research units of 
the United States Army Air Force 
gave direction to the large scale 
testing programs for selection and 
placement of air cadets. At the time 
of his termination from the service, he 
held the rank of Colonel. He was also 
decorated with the Legion of Merit. 
It was during these wartime years 
that he met in the psychological re- 
search units many service personnel 
who subsequently took PhD degrees 
with him at the University of South- 
ern California. 

That Guilford’s professional career 
has been an illustrious one can be 
attested to by the fact that he has 
been President of the following or- 
ganizations: the Psychometric Soci- 
ety, 1937-38; the Midwestern Psy- 
chological Association, 1939-40; the 
Western Psychological Association, 
1946-47; the California Color Society, 
1948-49; and the American Psycho- 
logical Association, 1949-50. Besides 
being a Diplomate, American Board 
of Examiners in Professional Psy- 
chology, he also has been President 
of the Division of Evaluation and 
Measurement, 1956-57, and of the 
Division of Esthetics, 1956-57, of the 
American Psychological Association. 
To these honors may be added mem- 
bership in the societies of Sigma Xi, 
Phi Beta Kappa, Phi Kappa Phi, Psi 
Chi, Phi Delta Kappa, and Phi Sigma. 
A few other important organizations 
in which he holds membership in- 
clude the National Academy of Sci- 
ences, the Society of Experimental 
Psychologists, the Society for Multi- 
variate Experimental Psychology, 
and the American Association for the 
Advancement of Science. 


That his critical judgment is fre- 
quently sought is also readily appar- 
ent in his holding positions on the 
editorial boards of such journals as 
the Psychological Bulletin, Psycho-
metrika, American Journal of Psy-
chology, and Educational and Psycho-
logical Measurement. In the past he
has been on the editorial board of the 
Journal of Experimental Psychology, 
the Journal of Abnormal and Social 
Psychology, the Journal of Applied 
Psychology, and the Psychological 
Review. 

Busy as he is, Guilford does find 
some time for recreation. The garden 
at his home is one of which any pro- 
fessional horticulturalist would be 
rightfully proud. No less remarkable 
are his countless color slides and 
numerous reels of motion pictures 
and tape recordings of music. 

Guilford's direct contributions to 
psychology are, of course, apparent in 
his many publications. Since many 
of these publications closely parallel 
his research endeavors they constitute 
a reasonably valid criterion of his 
thinking throughout his professional 
career. Although no attempt will be 
undertaken to review each contribu-
tion, an effort will be made to sum-
marize and to evaluate certain logi-
cally related groupings of his works as
well as certain selected publications. 

In addition to his several books, 
Guilford’s contributions may be
viewed for conceptual convenience as 
falling into the categories of (a) ex- 
perimental psychology including such 
areas as attention and visual percep-
tion, learning and memory, and ex-
perimental esthetics; (b) statistical
psychology embracing such problem 
fields as test theory and evaluation, 
factor analysis, psychophysics, and 
scaling; (c) mental abilities with 
emphasis upon the knowledge gained 
from factor analytic studies and from 
the theoretical model of the structure 
of intellect; and (d) personality theory
and measurement again based largely 
upon factor analytic formulations. 

For ease of reference to the Bibli- 
ography the following letters are used 
to designate different kinds of writ-
ings. The letter A stands for a book;
B for a monograph; C for a journal 
article or for a chapter in a book; D 
for a psychological test, manual, or 
profile sheet; and E for a publication 
with which Guilford was not associ-
ated. As an example: Guilford,
C1952d, would stand for the fourth
article published by Guilford during
1952.


Books 


Guilford has been the author, co- 
author, contributor, or editor of more 
than 20 books in psychology. 
Through these volumes his impact 
upon the teaching of psychology has 
been great indeed. Although each of 
them has left its imprint upon psy- 
chology, probably his two most 
widely used and influential books 
have been Psychometric Methods 
(Guilford, A1936, A1954) and Funda- 
mental Statistics in Psychology and 
Education (Guilford, A1942, A1950, 
A1956). Throughout the world both 
editions of Psychometric Methods 
have been familiar to professors and 
students engaged in problems of 
quantitative psychology. This vol- 
ume has served to unify develop- 
ments in psychophysical methods and 
scaling, test theory and development, 
and factor analysis. The chapter 
upon factor analysis, which still re-
mains one of the standard reference
guides for many a research psycholo-
gist and graduate student, is in itself
nearly sufficient for a one-semester 
graduate course in the subject. Per- 
haps in retrospect one may say that 
Psychometric Methods has been Guil-
ford’s greatest contribution to psy-
chology, although future history may
reveal his fundamental and system-
atic work on a theory of the structure
of the human intellect to be equally, 
or even more, noteworthy. 

Other well known books with 
which Guilford has been identified 
have included the Printed Classifica-
tion Tests (Guilford, A1947), one of
the comprehensive reports of the 
Army Air Forces Aviation Psychology 
Research Program, General Psychol- 
ogy (Guilford, A1939, A1952), Fields 
of Psychology (Guilford, A1940), Per- 
sonality (Guilford, A1959), and the 
Prediction of Categories from Measure- 
ments (Guilford & Michael, A1949). 
Representing sound scholarship and a 
high degree of clarity in exposition,
these books along with his others have 
played an important role in the train-
ing and in the sparking of lasting
interests in psychology for many an 
undergraduate and graduate student.
When these books are viewed in com- 
bination with his monographs, jour- 
nal articles, and tests, it can be seen 
that Guilford has been one of the most 
productive and creative psycholo- 
gists on the contemporary scene. 
What is most rewarding indeed is that 
there is no sign of any slackening in 
his rate of publication. In fact the 
writers feel that they are taking little 
risk in predicting that his creative 
efforts will increase and will continue 
to add much to the understanding of 
human behavior for many years to 
come. 


CONTRIBUTIONS TO EXPERIMENTAL 
PSYCHOLOGY


As examination of Guilford's writ-
ings will readily indicate, the scope of
his papers in experimental psychology
has been broad. Although the influ- 
ence of the Cornell tradition is ap- 
parent, there are many contributions 
that seem to fall well outside the 
range of problems of psychology as 
circumscribed by Titchener.


Attention and Visual Perception

Guilford's doctoral dissertation 
“‘Fluctuations of Attention’ with
Weak Visual Stimuli” was published 
in the American Journal of Psychol- 
ogy (Guilford, C1927a) subsequent to 
the appearance of five articles during 
1925 and 1926. A student of the his- 
tory of psychology will find Guilford's 
review of the literature of experi- 
mental studies dealing with attention 
to be a rewarding experience. In 
studying fluctuations with limited 
visual stimuli Guilford attributed the 
phenomenon embodying changes in 
minimal sensory experiences to the 
nature of the limen itself. He pointed 
out that (a) the durations of periods of
visibility and invisibility are related 
to the intensity of the stimulus and 
(b) both peripheral and central physi- 
ological conditions are involved. 
Thus, on the basis of both psycho- 
physical evidence and introspective 
reports of his subjects he concluded 
that the phenomenon he was investi- 
gating could not be called fluctuation 
of attention. 

From his study of eye movements 
in conjunction with his investigation 
of the fluctuation of attention, Guil- 
ford became interested in testing eye- 
movement theories of autokinetic 
sensation through making photo- 
graphs of these movements during the 
illusion. In an article with Dallen- 
bach he (Guilford & Dallenbach, 
C1928) found that eye-movement 
theories were inadequate to explain 
the autokinetic sensations, since the 
illusion took place in the absence of 
eye movements. In a follow-up paper 
Guilford (C1928a) concluded from his 
experimental results that streaming 
phenomena which involve the entire 
retina at once can account for the 
autokinetic sensation. 

Continuing his research with eye 
movements in relation to the occur-
rence of an apparent visual move-
ment from stationary stimuli, known
as the phi phenomenon, Guilford and 
Helson (C1929) demonstrated that 
there is no correlation between the 
phenomenon and the presence of eye 
movements. With respect to two sets 
of instructions involving the descrip- 
tion by subjects of attributive prop- 
erties and the cognitive properties of 
48 exposure fields, Guilford, in collab-
oration with Hackman (Guilford &
Hackman, C1936), studied interrela- 
tionships between numbers and di- 
rections of eye movements and data 
from the reports associated with each 
of the sets of instructions. On the 
basis of their findings they proposed 
a theory of levels of clearness in 
visual attention and speculated about 
the application of the theory to other 
sensory modalities. 


Learning and Memory 


An often quoted study by Guilford 
(C1927b) has been the one concern- 
ing the influence of form in learning. 
Actually, Guilford anticipated much 
of what the Gestaltists would say 
regarding learning in the thirties. 
Thus in his experiment he demon- 
strated that ease of memorizing a 
series of numbers greatly depends 
upon the presence of a readily grasped 
pattern or structured form that 
could also be perceived by way of 
successive inductions. Two years 
earlier Guilford, collaborating with 
Dallenbach (Guilford & Dallenbach, 
C1925), determined immediate mem- 
ory span from use of the method of 
constant stimuli. 


Experimental Esthetics 


For more than 30 years Guilford has 
shown an active professional interest 
in the affective value of stimuli from 
the various sensory modalities. His 
most persistent efforts have been di- 
rected toward the development of a 
system of color preferences. In his 
1938 Presidential Address to the
Psychometric Society, “A Study in
Psychodynamics,” Guilford (C1939)
reported upon the empirical relation- 
ships of the affective values of colors 
to the visual properties of hue, tint 
(brightness), and chroma (saturation) 
at various measurable levels, pre- 
sented useful isohedon charts, and 
suggested several theoretical impli- 
cations. In a comprehensive consider- 
ation of previously obtained data for 
color preferences Guilford and Smith 
(C1959) formulated several generali- 
zations and reported on the degree of 
accuracy with which predictions of 
affective value could be made as a 
function of hue, brightness, and satu- 
ration. In general the results of the 
studies have indicated that prefer- 
ences are highest in the span from 
green to blue and lowest in the yellow 
and yellow-green region when satura- 
tion and brightness are held constant. 
As a rule affective values are posi- 
tively correlated with saturation and 
brightness, although the functional 
relationships tend to be curvilinear. 

Collaborating with Holley, Guil- 
ford (Guilford & Holley, C1949) de- 
scribed and illustrated a factor ana- 
lytic approach to account for the 
variance associated with the esthetic 
judgments of 12 participants who 
rated 115 playing-card designs. This 
investigation was typical of so many 
by Guilford in that it contained both 
substantive findings and important 
methodological implications for a 
quantitatively oriented psychology. 


Other Contributions 


Guilford participated in several 
other experimental studies involving 
apprehension of time (Guilford, 
C1926a), the reading of facial expres-
sion (Guilford, C1929a), the inhibi-
tion and control of breathing (Vogeler
& Guilford, C1931, C1932), the fluc-
tuation in perceptions of ambiguous
figures by psychotic patients (Hunt
& Guilford, C1933), the relations of
visual sensitivity to the amount of
retinal pigmentation (Helson & Guil-
ford, C1933), the patellar reflex in
relation to personality variables
(Guilford & Hall, C1937), and reaction
time as an indicator of attention
value (Guilford & Ewart, C1940), to
say nothing of the numerous experi- 
mentally oriented investigations in 
the areas of personality, mental 
measurement, creativity, and psycho- 
physics. 


CONTRIBUTIONS TO STATISTICAL 
PSYCHOLOGY


Test Theory, Analysis, and Evaluation


Beginning in 1936 a whole series of 
articles concerned with the applica- 
tion of statistical procedures to the 
analysis and evaluation of test data 
appeared—papers that in many in- 
stances cut across the areas of psycho- 
physics and factor analysis. Follow- 
ing publication of his practically ori- 
ented paper (Guilford, C1936a) con- 
cerning the determination of difficulty 
for multiple-choice items when chance 
success is involved, Guilford (C1937b) 
proceeded to relate difficulty of 
mental test items as determined by 
absolute scaling methods to Fechner’s 
psychophysical formulation. Later he 
(Guilford, C1941a) demonstrated 
that a range in the level of difficulty 
of items is likely to be related to the 
factorial composition of a test. 

Of considerable impact upon item 
analysis was Guilford’s (C1941d) 
paper concerning the use of the phi 
coefficient and of chi square as indices 
of the validities of test items. His 
abacs for estimating phi coefficients 
of item discrimination are still widely 
used. Extending the results achieved 
in this paper he (Guilford, C1941e) 
proposed a simple scoring weight for 
multiple-choice items and a formula 
for its standard error. With the as-
sistance of Lovell and Williams,
Guilford (Guilford, Lovell, & Wil- 
liams, C1942) employed his formula 
to ascertain the validity and reliabil- 
ity of a test when wrong responses in 
multiple-choice items are differen- 
tially weighted and concluded that 
the system followed yielded no signif- 
icant gains over unweighted scoring. 
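
As a rough illustration of how the phi coefficient and chi square can serve as indices of item validity, the following Python sketch computes both from a fourfold table that crosses success on an item with membership in an upper or a lower criterion group. The function name and the sample counts are hypothetical. The relation chi square = N times phi squared is the standard textbook identity for a fourfold table without continuity correction; it is not quoted from the paper itself.

```python
import math


def phi_and_chi_square(a, b, c, d):
    """Return (phi, chi_square) for a fourfold table.

    Cell layout (an assumption made for this sketch):
        a = upper group, item passed    b = upper group, item failed
        c = lower group, item passed    d = lower group, item failed
    """
    n = a + b + c + d
    # Product of the four marginal totals under the square root.
    denom = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    if denom == 0:
        raise ValueError("a marginal total is zero; phi is undefined")
    phi = (a * d - b * c) / denom
    # Uncorrected chi square for a 2 x 2 table equals N * phi**2.
    chi_square = n * phi ** 2
    return phi, chi_square


# Hypothetical counts: 50 examinees in each criterion group.
phi, chi2 = phi_and_chi_square(40, 10, 20, 30)
print(round(phi, 3), round(chi2, 2))
```

A positive phi indicates that the item is passed more often in the upper group; the chi square value can then be referred to a table with one degree of freedom to judge whether the discrimination exceeds chance.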

The phi coefficient was a stimulus 
for two other papers. Collaborating 
with Perry, Guilford (Guilford & 
Perry, C1951) proposed several ways 
of determining other coefficients of 
correlation from the phi index. Em- 
phasizing applications to item analy- 
sis Guilford (Michael, Perry, & Guil- 
ford, C1952), in a publication co- 
authored by Michael and Perry, 
developed formulas for estimating a 
point biserial coefficient from a phi 
coefficient. His latest paper upon 
item analysis furnished a formula for 
estimating the correlation of an item 
with a composite from which the item 
itself has been removed (Guilford, 
C1953a). 
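
The idea of correlating an item with a composite from which the item itself has been removed can be sketched by brute force: subtract the item from each total score and correlate the item with what remains. The Python sketch below does exactly that and contrasts the result with the uncorrected item-total correlation. It is not the estimating formula of the paper cited, which is not reproduced here; the function names and the small score matrix are hypothetical.

```python
import math


def pearson(x, y):
    """Product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a variable has zero variance")
    return sxy / math.sqrt(sxx * syy)


def item_total_correlation(scores, item):
    """Correlate an item with the total score that still includes it."""
    item_scores = [row[item] for row in scores]
    totals = [sum(row) for row in scores]
    return pearson(item_scores, totals)


def item_rest_correlation(scores, item):
    """Correlate an item with the composite of all other items."""
    item_scores = [row[item] for row in scores]
    rest = [sum(row) - row[item] for row in scores]
    return pearson(item_scores, rest)


# Hypothetical data: six examinees (rows) by three items (columns), scored 0/1.
scores = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 0],
    [0, 0, 1],
]
print(round(item_total_correlation(scores, 0), 3))
print(round(item_rest_correlation(scores, 0), 3))
```

In this example the uncorrected coefficient is much the larger of the two, because the item contributes to the very total with which it is correlated. The inflation is greatest when the composite contains few items.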

Factor analysis in relation to test 
theory. Shortly after the close of 
World War II Guilford (C1946) pub- 
lished a definitive paper based upon 
his experiences in the psychological 
research units of the Army Air Force 
(AAF) in which he suggested several 
criteria for the evaluation of a test. 
The standards proposed represented 
a somewhat marked departure from 
what had been considered useful de- 
grees of reliability and validity for 
tests especially when a test is part of a 
factorially oriented battery. He was 
able to give examples from the AAF 
program of tests with high validities 
that were not so useful as others with 
lower validities and of a few tests 
with relatively low reliabilities that 
furnished useful indices of validity. 
His basic philosophy as to the utility 
of the factor analytic model in test 
construction and evaluation was re-
iterated and amplified in two subse-
quent papers (Guilford, C1948a,
C1948b), the first of which was his 
Presidential Address in 1947 to the 
Western Psychological Association.
Concurrent with and subsequent to 
his expression of his point of view has 
been the development and revision by 
a number of test publishers of tests of 
factored abilities. 

Pursuing his interest in factor 
analytic methodology as applied to 
test theory, Guilford, working with
Michael (Guilford & Michael, C1948),
devised formulas based on use of sup- 
pressor variables for estimating uni- 
vocal (pure) factor scores. Subse- 
quently, Guilford and Michael 
(C1950) proposed formulas for esti- 
mating changes in the magnitudes of 
common-factor loadings when tests 
are lengthened or shortened in a 
homogeneous fashion. 
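
The published formulas are not reproduced in this review. As a hedged illustration of the kind of relationship involved, the Python sketch below applies the familiar classical result for the correlation of a lengthened test with an outside variable, treating a common factor as that outside variable. Whether this expression agrees term by term with the formulas of the paper cited is an assumption, and the function name and numerical values are hypothetical.

```python
import math


def loading_after_length_change(loading, reliability, n):
    """Estimate a common-factor loading after a test is made n times as long.

    Assumes homogeneous lengthening, so that the added parts are parallel to
    the original test. Uses the classical relation
        a_n = a * sqrt(n) / sqrt(1 + (n - 1) * r_tt),
    where a is the original loading and r_tt the original reliability.
    Values of n below 1 represent shortening.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    denom = 1 + (n - 1) * reliability
    if denom <= 0:
        raise ValueError("length change is not meaningful for this reliability")
    return loading * math.sqrt(n) / math.sqrt(denom)


# Hypothetical test: loading .50 on a factor, reliability .60.
for n in (0.5, 1, 2, 4):
    print(n, round(loading_after_length_change(0.50, 0.60, n), 3))
```

Under these assumptions doubling the test raises the loading only from .50 to about .56, which illustrates why lengthening alone yields diminishing returns in the factorial saturation of a test.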

Factor analysis more broadly con- 
ceived. Although the application of 
factor analytic procedures to prob- 
lems of testing and evaluation has 
been of major interest to Guilford, his 
use of the factor analytic model of 
quantitative psychology as a tool for 
testing hypotheses has been useful in 
theory validation. Perhaps one of the 
most important papers upon method- 
ology in factor analysis as it relates to 
substantive problems was the one by 
Guilford (C1952d) in which he stated 
in a clear-cut way circumstances in
which one should not factor analyze.
A more positive approach was rep- 
resented in a recent paper (Guilford, 
C1961a) in which the values of the 
factor analytic model to theoretical 
psychology were described. His faith 
in the correlation and factor analytic 
procedures as research tools for theory 
building in psychology was re-empha- 
sized in an invited paper at the 
Twenty-Fifth Anniversary Observ- 
ance of the Psychometric Society 
held in Chicago on September 6 and 7,
1960 (Guilford, C1961b). In the
same paper he also indicated from a 
survey of articles in three quantita- 
tively oriented journals trends in 
psychological measurement and sta- 
tistical psychology. Additionally he 
cited several active issues and de- 
velopments in psychophysics, scaling, 
and mathematical models. A some- 
what similar presentation important 
for its conceptual orientation was 
a chapter entitled "Psychological 
Measurement" appearing in a recent 
book Current Psychological Issues 
(Guilford, C1958f). 


Psychophysics and Scaling 


Even a cursory inspection of Guil- 
ford's bibliography will reveal a large 
number of articles dealing with 
methodology of psychophysics or 
scaling as well as with applications of 
scaling techniques to important sub- 
stantive problems. Although, as 
mentioned previously, the two edi- 
tions of Psychometric Methods have 
afforded a comprehensive and pene- 
trating overview of psychophysical 
methods and scaling techniques, sev- 
eral specific contributions in the form 
of journal articles are noteworthy. 
After reviewing Thurstone's “equa- 
tion of comparative judgment” Guil- 
ford (C1928b) offered a simplification 
for the method of pair comparisons, 
and 4 years later he (Guilford, C1932) 
proposed a generalized psychophysi- 
cal law involving a power function, 
the two parameters of which would 
need to be determined experiment- 
ally. Making use of the method of 
choices Guilford (C1937c) set forth 
and illustrated procedures for deriv- 
ing scale values. When visual stimuli 
were judged by the method of equal 
appearing intervals Guilford (C1938a) 
noted discrepancies between the scale 
values derived and those found by 
pair comparisons and absolute scaling 
procedures. Moreover, in the region
of the higher intervals the units
derived from the pair-comparison 
approach not only tended to be larger 
than those found through use of the 
method of equal appearing intervals, 
but also were inconsistent with ex- 
pectations from Fechner's law. 

More than 15 years later Guilford 
and Dingman (C1954, C1955) pub- 
lished two important papers in the 
area of psychophysical methods. In 
the first paper they carried out five 
experiments with lifted weights in 
order to validate psychophysical 
methods in which ratio judgments 
are utilized. In addition to demon-
strating that the fractionation and
constant-sum methods gave entirely
concordant ratios when stimuli were 
judged in pairs, and to showing con- 
siderable agreement in the scale 
values and ratios obtained from 
various stimulus levels and from 
certain variations in the constant- 
sum method, they found support for 
the psychophysical power law with 
the power being slightly in excess of 
unity. In the second paper they pro- 
posed a modification in the method 
of equal-appearing intervals through 
the use of anchoring stimuli at the 
extremes of the list of stimuli. Their 
purpose was to counteract the ap- 
pearance of truncated frequency dis- 
tributions at these extremes. Experi- 
mentally, through use of stimulus 
weights, they succeeded not only in 
reducing the “end effect,” but also in
obtaining over the range of stimuli 
utilized strong support for Fechner's 
law. Because this result was contrary
to that found in their previous study,
they were inclined to attribute it to the
psychophysical method used.
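
The contrast between the two laws can be made concrete with a small sketch. Fechner's law makes judged magnitude a linear function of the logarithm of the stimulus, whereas the power law makes the logarithm of judged magnitude a linear function of the logarithm of the stimulus, the slope being the exponent. The Python code below fits both forms by least squares. The weights and judgments are synthetic values generated with an exponent of 1.1, chosen only to mimic a power "slightly in excess of unity"; they are not data from the studies described, and the function names are hypothetical.

```python
import math


def least_squares_line(x, y):
    """Return (slope, intercept) of the least-squares line y = slope*x + intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x has zero variance")
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return slope, my - slope * mx


def fit_power_law(stimuli, judgments):
    """Fit psi = a * S**b by regressing log(psi) on log(S); return (a, b)."""
    b, log_a = least_squares_line(
        [math.log(s) for s in stimuli], [math.log(p) for p in judgments]
    )
    return math.exp(log_a), b


def fit_fechner_law(stimuli, judgments):
    """Fit psi = k * log(S) + c by regressing psi on log(S); return (k, c)."""
    return least_squares_line([math.log(s) for s in stimuli], judgments)


# Synthetic illustration: weights in grams, judgments generated from a power law.
weights = [50, 100, 150, 200, 250]
judged = [2.0 * w ** 1.1 for w in weights]

a, b = fit_power_law(weights, judged)
k, c = fit_fechner_law(weights, judged)
print("power law: a = %.3f, exponent b = %.3f" % (a, b))
print("Fechner form: k = %.3f, c = %.3f" % (k, c))
```

With real judgments the two fits would be compared by the size of their residuals; which law appears to be supported can depend, as the authors suggested, on the psychophysical method that produced the scale values.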

Guilford's work in the methodology 
of rating scales has been noteworthy. 
A little more than 15 years after he 
described several constant errors in 
ratings (Guilford, C1938d), Guilford 
(A1954) presented a comprehensive 
rationale for errors of rating. Mak-
ing use of an analysis of variance
approach he attempted to isolate 
portions of variance in ratings at- 
tributed to differences between raters, 
between traits, between ratees and to 
interaction between raters and traits, 
between raters and ratees, and be- 
tween ratees and traits. Collaborat- 
ing with Dingman he (Dingman & 
Guilford, C1954) proposed a new pro- 
cedure for obtaining a weighted com- 
posite of ratings. However, his con- 
cern with problems of validity as well 
as with reliability of ratings was 
succinctly set forth in his 1960 address 
to the Psychometric Society (Guil- 
ford, C1961b). He pointed out that 
raters commonly confuse the dimen- 
sions on which evaluations are being 
made and pleaded for one's learning 
as much as one can about the nature 
of ratings before making use of them. 
Subsequently, Guilford, Christensen, 
Taaffe, and Wilson (C1962) gave 
careful scrutiny to the use of ratings 
and suggested that a factor analysis 
of the items on rating scales along 
with well defined marker variables be 
made whenever possible in order to 
clarify the nature of the character- 
istics being evaluated. 


CONTRIBUTIONS TO THE THEORY 
AND MEASUREMENT OF MENTAL 
ABILITIES


World War II Period


At the outbreak of World War II 
an aviation psychology program was 
set up in the AAF with its central 
office in Washington, D. C. Three 
major centers were established to 
develop needed psychological instru- 
ments and procedures for the screen- 
ing and classification of aviation 
cadets. These centers were called 
Psychological Research Units (PRU). 
When asked to organize and to direct 
PRU Number 3, which was located at 
the Santa Ana Aviation Cadet Cen-
ter, California, Guilford answered
this call to duty with characteristic
energy and effectiveness. Although
his previous experience had been 
primarily as a scholar rather than as 
an administrator, he accepted a com- 
mission in the Armed Services at this 
time of crisis and went about the task 
of organizing a large-scale effective 
organization to do research and 
development and to carry out testing 
and classification procedures for the 
large number of cadets flowing into 
the Santa Ana Center. 

The primary research assignment 
of PRU Number 3 was in the area of 
developing group tests of aptitudes 
and abilities. The extensive work 
done in this program on paper-and- 
pencil tests is summarized in the 
volume edited by Guilford (A1947). 
Experimental forms of tests in 18 
ability areas were constructed and 
administered to thousands of aviation 
cadets. Many of the tests were vali- 
dated against pass-fail and other 
empirical criteria in aircrew training 
programs. A battery of paper-and- 
pencil and psychomotor tests for the 
purpose of classification of cadets into 
pilot, navigator, and bombardier 
training courses was assembled. 

The best methods of test construc- 
tion and evaluation available at the 
time were applied. Each test was 
item-analyzed for internal consist- 
ency, and its reliability and predic- 
tive validity were determined. After 
batteries of experimental tests had 
been assembled in many of the ability 
areas, factor analyses were performed 
to gain a better idea of the interrela- 
tionships among the tests and the 
basic dimensions they represented. 

Factor analyses were also per- 
formed of the operational classifica- 
tion batteries with the addition of 
success-criterion variables in the air- 
crew specialties, e.g., pilot. Since this 
procedure made it possible to assess 
the abilities being measured by the
aptitude battery which were also 
present in the criterion variance, it 
amounted to a factorial approach to 
occupational analysis. 

The test-development and classifi- 
cation program in the AAF Aviation 
Psychology Program was a proving 
ground for many of the theories and 
procedures of psychometrics and of 
vocational selection. Much new 
ground was broken and many existing 
theories were put to a large-scale test 
for the first time. In a series of 
articles, Guilford (C1946, C1947, 
C1948a) drew the conclusions and 
lessons to be learned and publicized 
them. 

In generalizing the approach used in 
the AAF program, Guilford (C1948a) 
presented the rationale for using 
factor analysis in guiding a test- 
development program. He showed 
how knowledge of the factorial con- 
tent of tests and of job criteria could 
lead to insight into the identification 
of the individual differences in abili- 
ties entering into job performance 
and to an estimate of the proportion 
of the variance accounted for by each 
of these abilities. A search could then 
be instituted to find or to construct 
measures which would be related to 
those unexplained portions of the 
criteria that were reliably determined. 
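
The variance bookkeeping described in this rationale can be written compactly. What follows is only a sketch under the usual assumptions of the common-factor model with uncorrelated (orthogonal) factors and standardized scores; the symbols are introduced here for illustration and are not taken from Guilford's article.

```latex
% z_c    : standardized score on job criterion c
% F_k    : the m identified common factors, assumed orthogonal
% a_{ck} : loading of criterion c on factor k;  a_{ik}: loading of test i
% S_c, E_c : remaining reliable part and error part of the criterion
% r_{cc} : reliability of the criterion
\begin{align}
  z_c    &= \sum_{k=1}^{m} a_{ck} F_k + s_c S_c + e_c E_c, \\
  1      &= \sum_{k=1}^{m} a_{ck}^{2} + s_c^{2} + e_c^{2},
            \qquad r_{cc} = 1 - e_c^{2}, \\
  r_{ic} &= \sum_{k=1}^{m} a_{ik}\, a_{ck}, \\
  u_c    &= r_{cc} - \sum_{k=1}^{m} a_{ck}^{2} = s_c^{2}.
\end{align}
```

Under these assumptions each squared loading $a_{ck}^{2}$ is the proportion of criterion variance attributable to factor $k$, the validity $r_{ic}$ of a test follows from the loadings it shares with the criterion, and $u_c$ is the reliably determined but still unexplained portion toward which a search for new measures would be directed.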

In the several score studies made in 
the AAF program a number of apti- 
tude factors were repeatedly identi- 
fied and verified. Some of these were 
the primary factors which had been 
established in prewar investigations, 
and others were the result of the new 
ground that had been broken in the 
aviation psychology investigations. 
Guilford (C1947) gave a listing and 
description of these factors in Science 
and discussed their implications for 
the understanding of human behav- 
ior. He also described the factorial 
approach to intelligence and aptitude 
testing in a French-language publica- 
tion (Guilford, C1952a), and with 
Zimmerman he (Guilford & Zimmer- 
man, C1947) reviewed the status of 
aptitude factors from the standpoint 
of vocational guidance. 

A more general overview of the 
aviation psychology program and of 
the lessons to be drawn from it was 
given by Guilford (C1948b) in his 
Presidential Address to the West- 
ern Psychological Association. He 
pointed out that psychology came of 
age during that period and estab- 
lished itself through objectively dem- 
onstrated usefulness. Questions with 
which academic psychologists had 
long been concerned, such as trans- 
fer-of-training and learning prin- 
ciples, had immediate application to 
the evaluation of simulators and of 
other training devices. 

The importance of interpreting a 
correlation coefficient in terms of its 
meaning for a given situation rather 
than in terms of arbitrary standards 
was exemplified. Additionally, stress 
was placed upon use of factor analysis 
to establish objectively derived di- 
mensions of individual differences 
rather than the subjectively derived 
systems of categories that are fre- 
quently used in job analysis and ex- 
perimental psychology. 

The principal findings on the na- 
ture of human abilities were reviewed. 
In the first place, it was not necessary 
to assume a factor of general intelli- 
gence to account for the interrelation- 
ships among ability tests. Several of 
the basic primary ability factors such 
as verbal comprehension, numerical 
facility, and perceptual speed ap- 
peared repeatedly in the analyses. 
The previously identified space fac- 
tor was shown to represent two dis- 
tinct abilities: visualization and spa- 
tial orientation. Further clarification 
was obtained concerning the nature 
of memory and psychomotor abilities. 
The achievement areas of mechanical 
knowledge and/or mechanical ex- 
perience were added to the aptitude 
measures in order to obtain greater 
predictability for some criteria. The 
results in the area of reasoning ability 
were not clear. Several factors iso- 
lated did not fit in well with the usual 
rubrics of inductive and deductive 
reasoning. 

In the instance of nonability meas- 
ures the follow-up results and the 
predictive validities were less en- 
couraging. Interest and personality 
measures did not hold up well. Both 
the clinical evaluations and perform- 
ance tests of temperament that were 
tried out were not promising. Bio- 
graphical data and information tests 
when constructed and keyed for 
specific purposes yielded useful va- 
lidities. Guilford’s closing remarks 
are still pertinent: 


We psychologists occupy a key position, for 
in a world that has grown so small and in a 
world society that has become so interde- 
pendent, the problems of human relations are 
most important and their solution is para- 
mount. We have won considerable confidence 
in our preparedness to solve certain types of 
human problems. In order to maintain that 
confidence we shall have to continue to dem- 
onstrate competence. In order to cope with 
the larger human problems successfully, we 
shall have to demonstrate increased com- 
petence. All of this leads to the same conclu- 
sion. We must see to it that the psychologists 
whom we train are fully prepared not merely 
to fulfill the requirements as we have recently 
found them but to do even better than that. 
During this unique period with its pressure 
for more trained psychologists, the market is 
definitely in our favor. It is such as to attract 
able students in large numbers in our direc- 
tion. Let us recognize the fact that it requires 
a high level of ability and of stability to be a 
good psychologist and to be an effective public 
leader. Let us remember that of all scientists 
the psychologist must have the broadest ed- 
ucational base and the most varied and inten- 
sive drilling in logical, technical, and observa- 
tional procedures. Psychologists will achieve 
positions of leadership and will remain in 
those positions by reason of sound prepara- 
tion for them. If, as I have said before, 
psychology has arrived, this achievement is a 
vantage point at which even greater things 
are expected of us and which carries the 
challenge to be prepared to make good on new 
promises. Whether we go forward from there 
or stop where we are is up to us. 


At the close of World War II a 
large number of printed experimental 
tests had been constructed for which 
there was not sufficient time to carry 
out factor analyses and validation 
studies in the usual manner. One 
group of 39 aptitude tests was ad- 
ministered, in overlapping batteries, 
to a large sample of aviation students 
at Sheppard Field, Texas, in 1945. 
The factor analysis of this battery 
was subsequently completed by Guil- 
ford, Fruchter, and Zimmerman 
(C1952). The general conclusions 
from this analysis were as follows: 

1. In general, previously obtained 
factors, in the AAF results and else- 
where, were confirmed, as were their 
relationships to specific tests. 

2. Better tests, in terms of in- 
creased factor loadings, seem to have 
been developed for the factors of 
perceptual speed and spatial relations. 
Two previously developed experi- 
mental visualization tests were found 
to have higher loadings in that factor 
than had formerly been supposed. 

3. Tests designed as improved 
measures of the factors of length 
estimation, space II, and visual mem- 
ory seemed to be inferior to previous 
ones for the same factors. 

4. Certain hypotheses concerning 
the nature of the spatial-relations and 
visualization factors were supported: 
(a) that the AAF spatial-relations 
factor is an ability to perceive rela- 
tions of objects in space with respect 
to the observer's body, an orienta- 
tion in which the human body is the 
frame of reference; and (b) that 
visualization is the ability to manipu- 
late visual objects mentally. 

5. The usual reasoning factors 
failed to separate, probably because 
of an insufficient number and variety 
of definitive reasoning tests in the 
battery. 

6. The effort to reproduce the 
factorial composition of a psycho- 
motor test (Discrimination Reaction 
Time) in printed forms failed almost 
completely. The measurement of 
most of the factors involved, however, 
has been achieved in various other 
printed tests. 

7. There is some indication of a 
new space factor in which orienta- 
tion depends upon the compass points 
as a frame of reference. This hypothe- 
sis is worth serious investigation. 


Post World War II Studies 


Many of the results from the war- 
time studies with aptitude and in- 
telligence tests were of interest to 
postwar programs. Guilford (C1952b, 
C1953b, C1953c, C1953d) published 
a series of articles concerning the 
nature of thinking abilities in various 
journals that are the organs of many 
professional groups. 

The results of the investigations 
carried out in the AAF Aviation 
Psychology Program were not clear- 
cut for a number of the aptitude 
areas. One of these domains was that 
of spatial abilities. Although it ap- 
peared that there were two distinct 
abilities, visualization and spatial 
orientation, further verification was 
needed. Michael, Zimmerman, and 
Guilford (C1950, C1951) showed that 
the hypotheses of two independent 
factors held for a college sample and 
for separate male and female high 
school samples. Michael, Guilford, 
Fruchter, and Zimmerman (C1957) 
reviewed the research findings in the 
spatial-visualization domain, gave 
definitions, and recommended marker 
tests for the factors of spatial rela- 
tions and orientation, visualization, 
and kinesthetic imagery. 

There was considerable interest in 
and demand for aptitude tests of well- 
established factorial content in post- 
war research. A battery of such tests 
was prepared and published by Guil- 
ford and Zimmerman (D1947). Its 
use for guidance and other purposes 
was described by Guilford (C1956b) 
in one of a series of articles on multi- 
factor test batteries. 


University of Southern California 
Studies of Aptitudes of High-Level 
Personnel 


On his return to the University of 
Southern California faculty after his 
war service, Guilford initiated a 
series of studies of aptitudes of high- 
level personnel. Major support for 
these studies was derived from a series 
of contracts from the Office of Naval 
Research (ONR). The first in the 
series was a factor analytic study of 
reasoning abilities (Guilford, Comrey, 
Green, & Christensen, B1950; Green, 
Guilford, Christensen, & Comrey, 
C1953). Reasoning was one of the 
areas in which the results of the work 
done during the war were not clear- 
cut. The general procedure followed 
in this and in other studies was to 
review previous hypotheses of the 
factors, to propose an improved and 
more inclusive set of hypotheses, to 
construct tests especially designed to 
embody these hypotheses, to admin- 
ister the tests, and to factor analyze 
the results. Next one must review 
and revise the adopted hypotheses in 
the light of the interpretation and 
evaluation of the factor analytic re- 
sults. This procedure may be termed 
hypothesis verification through suc- 
cessive factor analysis. In this first 
investigation of a long series of apti- 
tude studies the reasoning battery of 
tests which was administered to a 
sample of officer candidates and air 
cadets yielded 12 interpretable fac- 
tors, 7 of which were in the reasoning 
domain. 

Another battery of 54 reasoning 
tests was assembled and administered 
to a sample of Air Force (AF) student 
officers and air cadets. The purposes 
of this study were to link together the 
batteries and results from the previ- 
ous reasoning analysis, the 1947 AF 
Aircrew Classification Battery, and a 
similar study being done at the Uni- 
versity of North Carolina (Guilford, 
Green, Christensen, Hertzka, & Kett- 
ner, C1954). Five of the factors 
found in the previous analysis were 
verified but one of them, logical 
reasoning (deduction), was considered 
to belong more properly in the evalua- 
tive rather than in the reasoning do- 
main. Only one of the factors, general 
reasoning, was represented in the air- 
crew battery tests. 

Guilford, Christensen, and Kettner 
(C1956) summarized their conclu- 
sions concerning the nature of the 
general reasoning factor by pointing 
out that it is easier to say what it is 
not, rather than what it is. Many 
hypotheses—such as those suggesting 
that it represents general intelligence, 
ability to manipulate symbols, ability 
to solve problems—were considered 
and rejected. Implications of a posi- 
tive nature were that general reason- 
ing has something to do with com- 
prehending or structuring problems 
of certain kinds in preparation for 
solving them. The range of the prob- 
lems and extent of the ability still 
remained to be determined. 

In addition to reasoning, other 
aptitude areas of importance to high- 
level personnel that were investigated 
were evaluation, planning, verbal 
fluency, and creative thinking. 
Hertzka, Guilford, Christensen, and 
Berger (C1954) prepared a battery of 
47 tests, 36 of which were experi- 
mental ones of evaluative abilities. 
This battery was administered to a 
sample of 397 air cadets and student 
officers. Six factors in the evaluation 
domain were identified, three of which 
had been previously known. The 
expected and previously confirmed 
judgment factor was not isolated as a 
separate dimension. Its variance 
seemed to be distributed among sev- 
eral of the evaluation factors. 

Another large-scale project was the 
investigation of abilities in the plan- 
ning domain (Guilford, Berger, & 
Christensen, B1954, B1955). Thirty- 
two experimental tests were con- 
structed. A battery of 52 tests was 
administered to a sample of 364 AF 
aircrew trainees. Of the 14 factors 
identified, 4 of them—ordering, elabo- 
ration, perceptual foresight, and con- 
ceptual foresight—were unique to 
planning tests. 

A study of flexibility in thinking by 
Frick, Guilford, Christensen, and 
Merrifield (C1959) revealed two hy- 
pothesized factors: spontaneous flexi- 
bility and adaptive flexibility. The 
nature of the tests which had loadings 
on these factors resulted in somewhat 
revised hypotheses of what the fac- 
tors represented. An interesting side- 
light was that the Water Jar test did 
not load on either of the flexibility 
factors, as expected from an earlier 
study by Frick and Guilford (C1957), 
but had variance on the logical 
evaluation and general reasoning 
factors. 

A new approach was used by Guil- 
ford and Christensen (B1956) to 
carry out a definitive factor analytic 
study of the much investigated but 
still ambiguous area of verbal fluency. 
Hypotheses were formulated in terms 
of differences between fluency factors. 
These differences were represented 
between tests in sets of two or more 
otherwise parallel tests. A battery of 
41 variables was administered to a 
sample of 221 Naval Air Cadets. The 
essential features of the factors were 
easily distinguishable by the nature 
of the tests, and modified definitions 
of the verbal fluency factors were 
proposed. 


Structure of Intellect 


Experience in working with a wide 
variety of intellectual tasks and fac- 
tors led Guilford to develop a model 
for the structure of intellect. His first 
attempt to organize the known intel- 
lectual factors into a system was con- 
tained in a paper presented to an 
international conference on factor 
analysis in Paris in July 1955 (Guil- 
ford, C1956a). Modifications of the 
model were reported by Guilford 
(C1956d, C1956e, B1957, C1958a, 
C1958e) to various groups as they 
were developed. The current form of 
the model was presented by Guilford 
in his book Personality (Guilford, 
A1959) and in the Walter Van Dyke 
Bingham Memorial Lecture (Guil- 
ford, C1959b). It has been used as a 
basis for a consideration of the prob- 
lems of curriculum (Guilford, 
C1958d), in relation to the teaching 
of reading (Guilford, C1960c), and as 
a basis for a systematic orientation 
with respect to psychological tests 
(Guilford, Fruchter, & Kelley, 
C1959). 

The structure of intellect in its 
current form is represented by means 
of a three-dimensional rectangular 
solid, as shown in Figure 1. With the 
large number of primary factors 
identified it became important to find 
some underlying, unifying principles 
that would make comprehension of 
the total list possible. The first 
principle of classification used is con- 
tent, or the kind of materials dealt 
with by the individual. Four types of 
contents are distinguished. These 
content categories may be viewed as 
stimuli to which a person responds or 
as general varieties of information— 
materials requiring discrimination by 
the organism. Three of the categories 
refer to known intellectual abilities; 
and the fourth one, behavioral, is 
brought in to include the abilities to 
deal with people. 

Five types of operations can be 
performed on each of the four types 
of materials or contents. These opera- 
tions represent inferences as to the 
major kinds of intellectual activities 
or processes that the organism em- 
ploys in handling the raw materials 
of information. 

The third major principle of classi- 
fication of the primary intellectual 
abilities is in terms of the kinds of 
products achieved by the different 
kinds of operations applied to the 
different kinds of content. These 
products which constitute the results 
arising from the processing of informa- 
tion by the organism may be thought 
of as categories of responses. 

Putting aside behavioral content, 
there should be 3 × 5 × 6, or 90, 
primary intellectual abilities, about 
60 of which have been isolated thus 
far. Guilford and Merrifield (B1960) 
have listed the two or three tests 
which best define each known factor. 
The model is also used to direct the 
search for additional factors that may 
seem to fill the vacant cells (Guilford, 
Merrifield, Christensen, & Frick, 
C1961). 
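
The cell count can be checked by enumerating the model's categories. The sketch below is illustrative only: the four content names appear in this review, while the five operation labels and six product labels are the ones commonly given for Guilford's 1959 model and are assumed here rather than quoted from the text; the helper `cells` is hypothetical.

```python
from itertools import product

# Content categories as named in this review.
CONTENTS = ["figural", "symbolic", "semantic", "behavioral"]
# Operation and product labels: assumed from the commonly cited 1959 model.
OPERATIONS = ["cognition", "memory", "divergent production",
              "convergent production", "evaluation"]
PRODUCTS = ["units", "classes", "relations", "systems",
            "transformations", "implications"]


def cells(contents):
    """Return every (operation, content, product) cell for the given contents."""
    return list(product(OPERATIONS, contents, PRODUCTS))


# Setting behavioral content aside leaves three content categories.
non_behavioral = cells([c for c in CONTENTS if c != "behavioral"])
all_cells = cells(CONTENTS)

print(len(non_behavioral))  # 3 x 5 x 6 cells
print(len(all_cells))       # 4 x 5 x 6 cells
```

Each tuple stands for one hypothesized primary ability, so a listing of this kind could serve as a checklist of cells that still lack an identified factor.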

In addition to the figural, symbolic, 
semantic, and behavioral content 
areas, Guilford (C1958h) has pro- 
vided a structure for the factors in the 
psychomotor domain. Seven types of 
psychomotor ability are taken to ac- 
count for the variance in psycho- 
motor performances, and these can 
be distinguished further by the part 
of the body involved. A 7 × 5 or 35- 
cell matrix emerges, about half the 
cells of which are occupied by identi- 
fied factors. 

Guilford (C1961a) recognizes three 
types or levels of factor models. One 
of these consists of primary factors, 
in which a number of independent 
coordinate traits are recognized. In 
another, traits are organized into 
hierarchical systems with the order 
corresponding to the level of the fac- 
tors. The third type of model is in 
matrix form such as is represented by 
the structure of intellect. 

Fig. 1. Theoretical model for the complete “structure of intellect.” 

Guilford and Merrifield (B1960) 
have pointed out that the matrix- 
type model is essentially a morpho- 
logical analysis of the basic patterns 
in a universe of information. It is 
usually possible to decide what the 
basic variables are and to recognize 
the logically possible classes or cate- 
gories of these variables. That the 
structure-of-intellect model may be 
supported as theory requires two 
types of verification. First, previ- 
ously found factors must be con- 
firmed as distinct from each other 
terms of their 


existences verified. 

Various implications of factors for 
psychological theory have been drawn 
by Guilford (C1961a). It is strongly 
suggested by the structure-of-intel- 
lect model that a human being 
should be conceived of as an instru- 
ment or agent whose psychological 
purpose is to deal with information. 
Information is defined as anything an 
individual discriminates. Association 
and learning phenomena are also 
readily related to the structure-of- 
intellect concepts. 


Creativity 


An important area of postwar re- 
search has been concerned with the 
psychology of creativity. In his 
Presidential Address to the American 
Psychological Association, Guilford 
(C1950) was one of the first to sound 
the clarion call to investigation in 
this field. He pointed out the impor- 
tance of the area and the paucity of 
work in it. In connection with his 
project on the aptitudes of high- 
level personnel he was able to initiate 
and to carry on an extensive program 
in creativity research. Hypotheses 
were developed concerning the abili- 
ties, such as fluency, flexibility, and 
sensitivity to problems, which distin- 
guish the creative person, and psy- 
chometric measures were constructed 
to embody these hypotheses. Twelve 
years later, Guilford (C1962a) pre- 
sented an exposition of the area of 
creativity from the standpoints of its 
measurement and its development. 

In a study of individual differences 
in originality (Wilson, Guilford, & 
Christensen, C1953, C1962), three 
criteria of scoring test responses for 
originality were applied: uncommon- 
ness, remoteness, and cleverness. In 
an analysis of the test battery, follow- 
ing its administration to a sample of 
410 air cadets and officers, five of the 
seven tests scored for originality ap- 
peared on a common factor tenta- 
tively described as representing origi- 
nality. Other new factors that were 
identified were adaptive flexibility, 
spontaneous flexibility, redefinition, 
and sensitivity to problems (Wilson, 
Guilford, Christensen, & Lewis, 
C1954). 

To consolidate and to extend the 
findings made in the combined areas 
of reasoning, creativity, and evalua- 
tion, a battery of 52 tests, most of 
them selected from previous studies, 
was assembled and alternative hy- 
potheses were listed for most of them 
(Guilford, Kettner, & Christensen, 
C1959). The tests were administered 
in small overlapping batteries to three 
samples of air cadets. Clarification of 
the nature of a number of factors, 
including sensitivity to problems, 
associational fluency, originality, and 
judgment was obtained. 

Certain special aspects of creativity 
were investigated. Wilson, Guilford, 
and Christensen (C1957) found a 
relatively constant rate for the pro- 
duction of inventive tasks, as com- 
pared with a diminishing rate for 
simple recall tasks. Two of the three 
aspects of originality studied, un- 
commonness and remoteness, in- 
creased with working time, while the 
third, cleverness, was independent 
of time. Although under the instruc- 
tions to be clever there was a decrease 
in the total number of responses, there 
was an increase in the number of 
clever responses as well as in the 
average degree of cleverness. 

An interesting approach was taken 
to determine the opinions of out- 
standing scientists and engineers con- 
cerning the relative importance of 
abilities in creative thinking (Allen, 
Guilford, & Merrifield, B1960). 
Twenty-eight of the most pertinent 
factors were described, and 35 high- 
ranking scientists from various parts 
of the country were asked to sort 
them into seven ranked categories. A 
similar ranking was made by a group 
of 50 nonscientists. There was con- 
siderable agreement in the rankings 
of the two groups. Divergent-pro- 
duction factors were not rated so 
highly as had been expected. The 
scientists rated factors in the product 
category of transformations highest, 
particularly the redefinition factors. 
It was concluded that the transforma- 
tion factors deserve recognition as 
important potential contributors to 
creative work. 


Guilford (C1956d, C1958a, C1959a) 
reported his work at a series of con- 
ferences on research in creativity. He 
was also concerned about the role of 
creativity in the arts (Guilford, 
C1954b, C1957a, C1958b, C1960f) 
and in education (Guilford, C1958d, 
C1960d, C1962b, C1962d; Guilford, 
Merrifield, & Cox, B1961). 

A number of the tests used in the 
studies of aptitudes of high-level 
personnel have been published by 
Christensen and Guilford (D1955, 
D1958a, D1958b, D1958c, D1958d, 
D1959), Hertzka and Guilford 
(D1955), Christensen, Guilford, and 
Merrifield (D1958), Berger and Guil- 
ford (D1960), and Wilson, Merrifield, 
and Guilford (D1961). 


CONTRIBUTIONS TO THE FIELD OF 
PERSONALITY 


Guilford's many outstanding con- 
tributions to the field of personality 
measurement over the past three 
decades have brought him recognition 
both at home and abroad. In his 
recent book, The Structure of Human 
Personality, for example, Eysenck 
(E1960) says, 
Compared to Guilford's patient, long-con- 
tinued, and fruitful work, most of the other 
researches in this field must be regarded as 
being of comparatively little interest, except 
in so far as they confirm or fail to confirm 
Guilford's findings (p. 191). 


These contributions began with an 
important series of investigations 
into the dimensionality of the con- 
cept Introversion-Extraversion. The 
Nebraska Personality Inventory 
(Guilford, D1934), An Inventory of 
Factors STDCR (Guilford, D1940b), 
the Guilford-Martin Inventory of 
Factors GAMIN (Guilford & Martin, 
D1943a), the Guilford-Martin Per- 
sonnel Inventory (Guilford & Martin, 
D1943b), and the Guilford-Zimmerman 
Temperament Survey (Guilford & 
Zimmerman, 1947) followed. These 
inventories were backed up by a 
number of careful research studies. 
Important studies and tests in the 
area of interests and needs and a book 
on personality round out the picture 
of Guilford's major contributions in 
the field of personality. 


Studies of Introversion-Extraversion 


Guilford's first interest in the field 
of personality developed in connec- 
tion with his studies of the highly 
popular concept of Introversion- 
Extraversion (Guilford, C1934b; 
Guilford & Hunt, C1931). Several 
early inventories had been developed 
to measure this trait, but their inter- 
correlations ranged from .19 to .62, 
in no case high enough to suggest that 
two different tests were measuring 
the same trait. Thus, the develop- 
ment of trait measures without bene- 
fit of careful scientific control seemed 
to meet with something less than 
complete success. 

In an attempt to provide a defini- 
tive demonstration of this phenome- 
non, Guilford and his wife undertook 
an investigation designed to test the 
hypothesis that Introversion-Extra- 
version is a unifactor dimension 
(Guilford & Guilford, C1934). This 
investigation, which attracted wide- 
spread attention, represented one of 
the really pioneering efforts to intro- 
duce empirical quantitative analysis 
and thinking into an area hitherto 
dominated by armchair thinking and 
speculative philosophy. In that 
study, 35 questionnaire items were 
selected which represented nonidenti- 
cal aspects of Introversion-Extra- 
version according to its acknowledged 
champions. That is, items were 
selected which constituted measures 
of Introversion-Extraversion accord- 
ing to recognized authorities while 
items were eliminated which appeared 
so similar in content as to be merely 
alternate forms of the same question. 
Although some writers have pointed 
out that many more than 35 mani- 
festations of Introversion-Extraver- 
sion exist, clearly, if a unifactor 
hypothesis fails for this subset, it 
fails for the total set. The 35 aspects 
of Introversion-Extraversion were 
formulated into "yes-no" questions 
and were balanced for certain re- 
sponse-set effects, a caution that 
anticipated much recent work. With 
the thoroughness that typi- 
fies Guilford, these questions were 
administered to a sample of almost 
1,000 subjects. In a second and more 
comprehensive statistical analysis of 
these data (Guilford & Guilford, 
C1936), tetrachoric intercorrelations 
were obtained among the 35 test 
items and a centroid factor analysis of 
the matrix was carried out. This 
utilization of items as variables in 
factor analytic work anticipated the 
most recent developments in refined 
quantitative personality test research. 
Analysis of total scores over pools of 
unanalyzed items unfortunately has 
been utilized by many investigators 
who have failed to appreciate the 
wisdom of Guilford’s more nearly 
exact approach. In applying the 
factor analytic method described by 
L. L. Thurstone (E1934) in his newly 
published book, Vectors of Mind, 
Guilford showed quick appreciation 
of the possibilities of a powerful new 
statistical tool. 

A single factor proved to be in- 
sufficient to account for the inter- 
correlations among these 35 items. 
In fact five meaningful factors were 
utilized to account for the bulk of the 
common variance: Social Introver- 
sion, Emotionality, Masculinity, 
Thinking Introversion, and Rha- 
thymia. Since subsequent test meas- 
ures of these five factors did prove to 
be correlated, overlap does exist 
among certain measures of Intro- 
version-Extraversion. Guilford's work 
made it clear, however, that Intro- 


version-Extraversion represents a 
composite of correlated primary fac- 
tors rather than a single dimension as 
was once commonly supposed. 


Nebraska Personality Inventory 


To build better scales for the meas- 
urement of the three dominant fac- 
tors in this first factor analysis, more 
than 80 new items were developed to 
measure Social Introversion, Emo- 
tionality, and Masculinity. These 
new items, along with the 35 original 
items, were administered to another 
large sample. In this instance a "?" 
category of response, as well as the 
usual yes and no answers, was per- 
mitted. Scores were developed for 
each of the three factors through using 
those items which had substantial 
loadings for the factor. The highest 
and lowest scoring quarters of the 
subjects for each factor were selected 
to constitute criterion groups. The 
items were then correlated with these 
three criteria. Approximately 100 
items proved to be suitable for the 
measurement of these three factors, 
giving rise to the Nebraska Personal- 
ity Inventory. 

In an attempt to reduce the inter- 
correlations among scales designed 
to measure the three factors—Social 
Introversion, Emotionality, and Mas- 
culinity—the Guilfords selected those 
items which had proved to be the 
most nearly factorially pure measures 
of these dimensions. Another ques- 
tionnaire was developed including 
new items to measure the other two 
factors found in the first analysis, 
Rhathymia and Thinking Introver- 
sion, as well as the three factors 
already measured in the Nebraska 
Personality Inventory (Guilford & 
Guilford, C1939a). These 89 items 
were administered to a large sample 
and 30 items were selected for fac- 
torial analysis. Tetrachoric inter- 
correlations among the items were 
factor analyzed by the centroid 
method with the extraction and rota- 
tion of nine factors. The best defined 
of these factors proved to be the three 
previously isolated factors: Thinking 
Introversion, Social Introversion, and 
Rhathymia; and two newly identified 
factors, Depression and Alertness. 
One further joint study by the 
Guilfords resulted in the definition of 
two additional personality factors, 
Nervousness and General Drive (Guil- 
ford & Guilford, C1939b). This in- 
vestigation, like the first one, was 
designed to test a specific hypothesis, 
an important feature of the well de- 
signed factor analytic study; namely, 
that there is a single real variable of 
Hyperactivity-Hypoactivity. This 
hypothesis was prompted by the 
theory of G. L. Freeman concerning 
the physiological basis of tempera- 
mental differences. Twenty-four 
items were formulated to assess the 
hypothetical factor of Hyperactivity- 
Hypoactivity and combined with 
other items to make a 100-item ques- 
tionnaire which was administered to 
600 students. Seven centroid factors 
were extracted from the matrix of 
tetrachoric coefficients among the 24 
items. Rotation to simple structure 
resulted in two clear-cut factors. The
first of these, Nervousness, was rep- 
resented by the jumpy, hypertense, 
and nervous individual. The second, 
called General Drive, described the 
fast-moving, dynamic, hurried indi-
vidual. These factors, which seemed
independent of one another, dem-
onstrated that the a priori notion
of Hyperactivity-Hypoactivity had 
oversimplified the actual situation. 
A similar result had been obtained 
earlier with investigations of the 
Introversion-Extraversion concept. 
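
As a rough illustration of the centroid method referred to above, the following sketch extracts a single centroid factor from a small correlation matrix. The function name, the three-variable matrix, and the choice of the largest off-diagonal correlation in each column as the communality estimate are assumptions made for illustration only; none of them is taken from the Guilfords' studies.

```python
import math


def first_centroid_factor(r):
    """Return (loadings, residual) for the first centroid factor.

    r is a square, symmetric correlation matrix (list of lists) whose
    diagonal already holds communality estimates. The grand sum of the
    matrix is assumed to be positive.
    """
    n = len(r)
    col_sums = [sum(r[i][j] for i in range(n)) for j in range(n)]
    root = math.sqrt(sum(col_sums))
    # Each loading is the column sum divided by the square root of the grand sum.
    loadings = [s / root for s in col_sums]
    # Remove the variance explained by this factor to obtain the residual matrix.
    residual = [
        [r[i][j] - loadings[i] * loadings[j] for j in range(n)]
        for i in range(n)
    ]
    return loadings, residual


# Hypothetical correlations among three items, with communality estimates
# (largest off-diagonal value in each column) placed in the diagonal.
example = [
    [0.6, 0.6, 0.5],
    [0.6, 0.6, 0.4],
    [0.5, 0.4, 0.5],
]
loadings, residual = first_centroid_factor(example)
```

Every column of the residual matrix sums to zero after extraction, which is why later factors in the full procedure require reflecting some variables before the step is repeated; that reflection step and the subsequent rotation are not shown here.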


Guilford Factorial Inventories 


Following these pioneering back-
ground studies, several inventories
were developed over the years by 
Guilford and his collaborators, cul- 
minating with the Guilford-Zimmer- 
man Temperament Survey. The first 
of these questionnaires, An Inventory 
of Factors STDCR (Guilford, 
D1940b), was designed to cover the 
factors suggested by the various 
analyses concerned with the Intro-
version-Extraversion concept. The
175 questions of the "Yes ? No" type
produce scores on the following
variables: S—Social Introversion,
shyness and withdrawal tendencies;
T—Thinking Introversion, medita-
tive, philosophizing tendency toward
analyzing self and others; D—Depres- 
sion, feelings of unworthiness and 
guilt; C—Cycloid, tendency toward 
fluctuations in mood, instability, and 
strong emotional reactions; R—Rha- 
thymia, carefree disposition, liveli- 
ness, and impulsiveness. Reliabilities 
for these scales were estimated to be 
between .89 and .92. Norms were 
provided separately for senior high 
school students, university students,
and adults. 

The Guilford-Martin Inventory of 
Factors GAMIN (Guilford & Martin, 
D1943a) appeared next. In the 
original edition of this test, there were 
270 items and a difficult scoring 
method was employed. Using inter- 
nal consistency item-analysis meth- 
ods, the inventory was reduced to 186 
items in the abridged edition pub- 
lished 2 years later. Reliabilities for 
the abridged edition were generally 
in the eighties. Scores for the follow- 
ing traits were obtained: G—Gen- 
eral Activity, great pressure for overt 
activity; A—Ascendance-Submission, 
dominance in social situations, lead- 
ership; M—Masculinity-Femininity, 
masculinity of interests and attitudes; 
I—Inferiority Feelings, self-confi-
dence and lack of inferiority feelings;
N—Nervousness, lack of nervousness
and irritability.
The third questionnaire to appear
was the Guilford-Martin Personnel 
Inventory (Guilford & Martin, 
D1943b). Aside from covering certain 
factors not already included in the 
other two inventories, this test was 
intended to assist supervisors of 
workers in business and industry to 
single out and to diagnose those 
individuals who are personally mal- 
adjusted. In particular, items were 
selected which might be diagnostic of 
(a) subjectivity, a taking of things 
personally, having ideas of reference, 
or exhibiting touchiness; (b) belliger- 
ence, a domineering attitude, a crav- 
ing for superiority; (c) suspiciousness; 
and (d) fault-finding or hypercritical- 
ness. The method of selecting items 
was typical of the approach used by 
Guilford in questionnaire develop- 
ment. Two hundred items were given 
to 500 industrial and business em- 
ployees, divided equally by sex. A 
priori keys were developed through 
use of the best available trait knowl- 
edge to provide preliminary measures 
of the variables. Four hundred papers
were scored on these a priori keys for 
each variable, and the top and bottom 
100 scores were identified. Items 
were retained which correlated highly 
with the dichotomy between high and 
low scores for the particular variable 
in question. The remaining 100 cases 
provided data for checking reliability.
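
The item-selection procedure just described can be sketched roughly as follows. The source does not say which index of correlation with the high-low dichotomy was used or what cutoff was applied, so the phi coefficient, the `min_phi` threshold, and all function and parameter names below are assumptions chosen for illustration.

```python
import math


def phi_coefficient(a, b, c, d):
    """Phi for a 2 x 2 table.

    a = high group answering yes, b = high group answering no,
    c = low group answering yes,  d = low group answering no.
    """
    denom = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return (a * d - b * c) / denom if denom else 0.0


def select_items(responses, key_scores, group_size, min_phi):
    """Keep items that separate the top and bottom scorers on an a priori key.

    responses:  one list of 0/1 item responses per person.
    key_scores: each person's total score on the a priori key.
    group_size: number of persons in each extreme group.
    min_phi:    hypothetical cutoff for retaining an item.
    """
    order = sorted(range(len(key_scores)), key=lambda p: key_scores[p])
    low, high = order[:group_size], order[-group_size:]
    kept = []
    for item in range(len(responses[0])):
        yes_high = sum(responses[p][item] for p in high)
        yes_low = sum(responses[p][item] for p in low)
        phi = phi_coefficient(
            yes_high, group_size - yes_high, yes_low, group_size - yes_low
        )
        if phi >= min_phi:
            kept.append(item)
    return kept
```

With 400 scored papers and `group_size=100`, the two extreme groups correspond to the top and bottom 100 scores mentioned in the text; cases held out of the analysis would then serve for the reliability check.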

The three variables that survived 
the item analysis procedure gave rise 
to a 150-item test with the following 
scores: O—Objectivity versus Sub-
jectivity, or hypersensitivity; Ag—
Agreeableness versus Generalized 
Hostility, or belligerence; and Co— 
Cooperativeness, or tolerance versus 
fault finding disposition. A profile 
chart for use with these three in- 
ventories was published shortly after- 
wards (Guilford & Martin, D1945). 

Although a great deal of work had 
gone into the development of these
inventories, especially with the la- 
borious methods of calculation avail- 
able at that time, it was inevitable 
that revisions would be needed. In 
the first place, the exigencies of calcu- 
lation made it impossible to develop 
all these variables in one giant analy- 
sis. Overlapping among some of the 
scales, therefore, was inevitable. Ac- 
cordingly, taking note of experience 
in extensive applications of these 
inventories, Guilford and his col- 
league, W. S. Zimmerman, developed 
a new inventory, the Guilford-Zim- 
merman Temperament Survey, de- 
signed to measure the 10 most impor- 
tant and most nearly independent 
factors available from all the previous 
inventories discussed (Guilford & 
Zimmerman, D1949). This new in- 
ventory consisted of 300 Yes ? No 
items divided equally among the 10 
traits. No item was scored on more 
than one scale. This important 
precaution has been recommended 
by Guilford to others who construct 
personality inventories. 

The traits for which scores were 
obtained on the Guilford-Zimmerman 
Temperament Survey were as fol- 
lows: G—General Activity; R—Re- 
straint versus Rhathymia; A—As- 
cendance; S—Sociability, or the op- 
posite of Social Introversion; E— 
Emotional Stability, a combination 
of D and C which proved to be too 
highly correlated; O—Objectivity; 
F—Friendliness, formerly Ag; T— 
Thoughtfulness, formerly Thinking 
Introversion; P—Personal Relations, 
formerly Co; and M—Masculinity of 
emotions and interests. Norms were
obtained on a college population of
523 men and 389 women for all the
traits except T, where high school
seniors and their parents were used.
The estimated reliabilities of the 
scores ranged from .75 to .85. Inter- 
correlations between the traits in the 
norm group were generally small, 
although several appeared in the
region of .4 and one around .6. De- 
tailed information was provided in 
the manual for interpreting scores 
and patterns of scores. Several scales 
for determination of the validity of 
responses have been developed for use 
with the test. 

Meanwhile, two investigations by 
other authors raised some doubt as to 
whether indeed all 13 of the factors 
were necessary to describe this par- 
ticular portion of the personality 
hyperspace (Lovell, E1945; Thur- 
stone, E1951). After all, only 10 
factors had been included even in the 
Guilford-Zimmerman Temperament 
Survey. Lovell intercorrelated the 13 
inventory scores STDCR, GAMIN, 
and O, Ag, Co. Using estimated com- 
munalities, she extracted and rotated 
six factors. Some textbook writers 
have leaped to the mistaken conclu- 
sion that these six factors could be 
used to measure what was being as- 
sessed in the entire 13 scales. Since 
only the variance in common among 
these measures was analyzed, such a 
conclusion is patently false. In addi- 
tion to this, the intercorrelations 
among the original scales were spuri- 
ously high, because of the presence of 
overlapping specific variance from the 
use of the same items on more than 
one scale. This practice was aban- 
doned later in the Guilford-Zimmer- 
man Temperament Survey. 
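
The size of the spurious correlation produced by shared items can be gauged with a simple worked case. Under the simplifying assumption that all items are mutually uncorrelated and have equal variance (an assumption made here for illustration, not a claim about the actual inventories), two scales of n_a and n_b items that share k items correlate k / sqrt(n_a * n_b) purely because of the overlap. The function name and the item counts in the comment are hypothetical.

```python
import math


def overlap_correlation(n_items_a, n_items_b, n_shared):
    """Correlation between two unit-weighted scales due only to shared items.

    Assumes every item has the same variance and that distinct items are
    uncorrelated, so the covariance of the two totals equals the number of
    shared items times the common item variance.
    """
    return n_shared / math.sqrt(n_items_a * n_items_b)


# Hypothetical case: two 30-item scales with 6 items keyed on both.
spurious_r = overlap_correlation(30, 30, 6)
```

In this hypothetical case the overlap alone contributes a correlation of .20, even though the distinct items are unrelated.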

In Thurstone's study, which used 
Lovell's correlation matrix, reliability 
estimates from the test manual were 
introduced in order to analyze the 
total true variance in the inventory 
scores instead of just the common 
factor variance among the scores. 
This would permit all 13 factors to 
emerge if conditions were proper. 
However, this procedure does not 
obviate the difficulty created by the 
inflated correlation coefficients. A 
further problem was introduced by
the fact that the reliability coeffi- 
cients were obtained for a sample 
different from that for which the cor- 
relations were obtained. Thurstone's 
conclusion that nine factors are suffi- 
cient to account for the variance 
among the 13 inventory scores thus 
remained open to question. 


Thirteen Factors of Temperament 


No adequate single study had been 
conducted in which all the 13 factors 
were properly represented. When 
doubts were raised by these two 
studies about the dimensionality of 
the space spanned by the 13 scores, 
Guilford and Zimmerman undertook 
a large scale analysis to provide a 
definitive answer to the question. It 
was possible, fortunately, to use the 
original test records gathered by 
Lovell for her study and thus to 
avoid any question about a lack of a 
comparable basis for the conclusions 
because of different data (Guilford & 
Zimmerman, B1956). In order to 
reduce the correlation matrix to 
manageable size, items for the 13 fac-
tors were divided into homogeneous
small pools on the basis of inspection 
of content and previous available 
statistical information about the 
items. This was done independently 
by two psychologists. Where discus- 
sion of differences on the categoriza- 
tion of an item failed to eliminate the 
disagreement, the item was thrown 
out. Sixty-nine pools of items were 
formed in this way, the pools varying 
from a minimum of 2 items to a 
maximum of 11. Each of these item 
pools was allocated by hypothesis to 
one of the 13 factors, three pools con- 
stituting the minimum number allo- 
cated to define any given factor, and 
seven pools the maximum number. 

Total scores over these 69 item
pools were obtained, with scoring
weights of one and zero being em- 
ployed. Although most item pools 
were scored in the same direction as
the inventory variables from which 
they were obtained, some score 
variables were reflected. To these 69 
variables was added the one of sex 
dichotomy to form a total of 70 vari- 
ables for a factor analysis that was 
based on a sample of 126 men and 87 
women. The distributions for all 
variables were dichotomized, and 
tetrachoric coefficients of intercorre- 
lation among the variables were esti- 
mated by the cosine-pi approxima- 
tion. Eighteen factors were extracted 
by the centroid method and rotated 
to meaningful orthogonal positions 
with positive manifold and simple 
structure used as the major criteria. 
Rotation revealed that 14 of the 18 
factors could not be considered re- 
siduals. The need for at least 13 
factors to cover the psychological 
domain represented by the three 
original inventories was confirmed. 
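
The cosine-pi approximation mentioned above estimates a tetrachoric coefficient from the four cell frequencies of a 2 x 2 table. A minimal sketch follows; the function name and the example frequencies are illustrative, and the approximation is generally regarded as most trustworthy when both variables are split near their medians.

```python
import math


def tetrachoric_cos_pi(a, b, c, d):
    """Cosine-pi estimate of the tetrachoric correlation.

    a and d are the frequencies in the two concordant cells (high-high and
    low-low); b and c are the frequencies in the two discordant cells.
    Equivalent to cos(pi / (1 + sqrt(a*d / (b*c)))), written so that an
    empty discordant cell does not cause a division by zero.
    """
    concordant = math.sqrt(a * d)
    discordant = math.sqrt(b * c)
    if concordant + discordant == 0:
        raise ValueError("table has no usable cell frequencies")
    return math.cos(math.pi * discordant / (concordant + discordant))


# Hypothetical table: 40 high-high, 10 high-low, 10 low-high, 40 low-low.
estimate = tetrachoric_cos_pi(40, 10, 10, 40)
```

For the hypothetical table shown, the estimate is cos(pi/5), or about .81; a table with equal frequencies in all four cells gives an estimate of zero.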

All 13 of the original factors were 
verified for the most part, although 
with slightly different characteristics 
in some cases. A fourteenth factor 
appeared with some of the properties 
formerly associated with Factors C 
and R. These results left no doubt 
that at least 13 factors were needed 
to account for the common variance 
among these questionnaire variables.
Since these factors represent the cul- 
mination of 25 years’ effort by Guil- 
ford and his collaborators, it will be 
appropriate to provide a brief de- 
scription of each one. The high- 
scoring person on the factor is de- 
scribed, unless otherwise noted: 

1. G—General Activity: energetic, 
rapid-moving, rapid-working person 
who likes action and may sometimes 
be impulsive. 

2. A—Ascendance: the person who
upholds his rights and defends him-
self in face-to-face contacts; does not
mind being conspicuous, in fact may 
enjoy it; through social initiative,
gravitates to positions of leadership; 
is not fearful of social contacts, is not 
inclined to keep his thoughts to him- 
self. There is little to indicate that 
“submission” accurately describes 
the negative pole, as was formerly 
believed. 

3. M—Masculinity versus Fem- 
ininity: has masculine interests, voca- 
tional and avocational; not emo- 
tionally excitable or expressive; not 
easily aroused to fear or disgust; 
somewhat lacking in sympathy. 

4. I—Confidence versus Inferior- 
ity Feelings: feels accepted by others, 
confident, and adequate; socially 
poised; satisfied with his lot, not self-
centered.

5. N—Calmness, Composure ver- 
sus Nervousness: calm and relaxed 
rather than nervous and jumpy; not 
restless, easily fatigued, or irritated; 
can concentrate on the matter at 
hand. 

6. S—Sociability: likes social ac- 
tivity and contacts, formal or infor- 
mal; likes positions of social leader- 
ship; has social poise; not shy, bash- 
ful, or seclusive. 

7. T—Reflectiveness: given to 
meditative and reflective thinking; 
dreamer; philosophically inclined;
has curiosity about and questioning
attitude toward behavior of self and
others.

8. D—Depression: emotionally and
physically depressed rather than
cheerful; given to worry and anxiety
and to perseverating emotions and
changeable moods.

9. Ci—Emotionality: emotions
easily aroused and perseverating, yet 
shallow and childish; daydreamer 
(not identical with Factor C—Cy- 
cloid). 

10. R—Restraint versus Rha- 
thymia: self-restrained and self-con- 
trolled; serious minded rather than 
happy-go-lucky; not cheerfully irre- 
sponsible. 
11. O—Objectivity: takes an ob-
jective, realistic view of things; alert
to his environment and can forget
himself; not beset with suspicions,
hypersensitivity, unwarranted sym- 
pathies, anxiety, or feelings of guilt. 

12. Ag—Agreeableness: low-scor- 
ing individual is easily aroused to 
hostility; resists control by others; 
has contempt for others and may be 
aroused to aggressive action. High 
scoring person is friendly and com- 
pliant. 

13. Co—Cooperativeness, Toler- 
ance: low-scoring person is given to 
critical fault-finding generally; has 
little confidence or trust in others; 
self-centered and self-pitying. 

These factor descriptions vary 
from the previous ones mainly in that 
Factor R lost its impulsiveness as- 
pect, opening the way for a separate 
impulsiveness factor, and Factor C 
lost its fluctuation of mood character, 
leaving a more restrictive interpreta- 
tion in the Ci factor. Intercorrela- 
tions among the factors, as deter- 
mined by a clever indirect procedure 
not requiring oblique rotations, ap- 
peared to be close to zero in most 
cases. Factor Ci had a high negative
correlation with R, and moderate 
correlations were noted between Fac- 
tors G and Ag, A and I, N and D, 
Ag and Co. Other correlations were 
small or negligible. 


Measurement of Interests 


Although Guilford's contributions 
to the measurement of interests and 
related aspects of personality have 
not been so numerous nor so domi- 
nant in the field as in the area of tem- 
perament, he is nevertheless responsi- 
ble for two tests in this area, an ex- 
tensive factorial analysis of interests, 
and additional relevant research ar- 
ticles. 

Making use of the best available 
knowledge of the factors in the inter-
est area at the time, Guilford and his
collaborators published in 1948 the 
Guilford-Shneidman-Zimmerman In-
terest Survey (Guilford, Shneidman,
& Zimmerman, D1948). Since items 
were designed for use with an absolute 
judgment scale of response, the sta- 
tistical dependency between scores 
introduced by forced-choice judg- 
ments in certain other interest in- 
ventories was obviated. The survey 
employs nine main scoring categories, 
each divided into two subcategories: 
Artistic, Appreciative and Expres- 
sive; Linguistic, Appreciative and 
Expressive; Scientific, Investigatory 
and Theoretical; Mechanical, Manip- 
ulative and Designing; Outdoor, Nat- 
ural and Athletic; Business-Political, 
Mercantile and Leadership; Social 
Activity, Persuasive and Gregarious; 
Personal Assistance, Personal Serv- 
ice and Social Welfare; Office Work, 
Clerical and Numerical. 

A few years later when newly 
available computing facilities made it 
feasible, Guilford and three of his
students undertook the most exten-
sive factor analytic study ever car-
ried out in the interest area (Guilford, 
Christensen, Bond, & Sutton, B1954). 
This study followed the classic design 
for which Guilford is so well known: 
(a) develop primary hypotheses about 
the nature of the factors; (b) formu- 
late subsidiary hypotheses about the 
factors which lead to the development 
and selection of variables to go into 
the factor analysis; (c) construct a 
test to measure each subsidiary hy- 
pothesis; (d) factor analyze the scores 
obtained, including rotation to psy- 
chologically meaningful factor posi- 
tions; and (e) interpret the obtained 
factors in relation to the hypotheses 
and tests. 

Defining interest broadly enough 
to include a coverage of motivational 
variables not usually treated in inter- 
est inventories, Guilford and his col-
laborators culled previous studies,
considered various treatises on moti- 
vation, and conducted field interviews 
to develop 33 major factor hypothe- 
ses in this area. Subhypotheses num- 
bering from one to six, each reflecting 
a specific feature, variation, or alterna-
tive conception of the hypothesized
major factor, were formulated for 
each major factor. After item analy- 
ses for internal consistency and 
screening for suitable distribution 
characteristics, 1,000 items were se- 
lected to provide 100 subtests, each 
consisting of 10 Yes ? No items. 

After administration of the test to
large groups of AF airmen and officer- 
level personnel, 95 variables were 
selected for factor analysis, separately 
in the two groups of subjects, elimi- 
nating five unsatisfactory inventory 
variables. Factors were extracted by 
the centroid method, and orthogonal 
rotation of the two centroid matrices 
was executed through use of simple 
structure and psychological mean- 
ingfulness as guides. Seventeen of 
the factors were sufficiently similar in 
the two analyses to warrant the same 
name: Mechanical Interest, Scientific 
Interest, Adventure versus Security, 
Social Welfare, Esthetic Apprecia- 
tion, Cultural Conformity, Self-Re- 
liance versus Dependence, Esthetic 
Expression, Clerical Interest, Need 
for Diversion, Autistic Thinking, 
Need for Attention, Resistance to 
Restriction, Business Interest, Out- 
door-Work Interest, Physical Drive, 
and Aggression. The remaining five 
or six factors in the two analyses 
could not be equated. Of the factors 
just mentioned, all had been hy- 
pothesized prior to the analysis except 
Self-Reliance, Autistic Thinking, and
Resistance to Restriction. 

This investigation was noteworthy 
in testing the unity of several known 
interest factors in the presence of a 
generous number of motivation vari-
ables. The previously well known
Mechanical, Scientific, Social Wel- 
fare, Esthetic Expression, Clerical, 
and Business Interest factors stood 
this test very well. The study also 
showed, however, that no existing 
inventory of basic-interest score cate- 
gories gives sufficient coverage of the 
interest domain, when that domain 
includes all the human urges to ac- 
tivity that have vocational implica- 
tions. An immediate consequence of 
this large study of interests was the 
development of the DF Opinion 
Survey (Guilford, Christensen, & 
Bond, D1956) to measure several 
hormetic variables suggested by that 
research, namely: Need for Attention, 
Need for Freedom, Need for Diver- 
sion, Liking for Thinking, Liking for 
Adventure versus Security, and Cul-
tural Conformity.

More recently, an investigation 
was carried out in which many of the 
factors found in the large interest 
analysis were studied in connection 
with some variables designed to meas- 
ure interest in different kinds of 
thinking (Guilford, Christensen, 
Frick, & Merrifield, C1961). Two of 
the three interest-in-thinking factors 
from the previous study were verified, 
and two additional interest-in-think- 
ing factors emerged as a result of the 
development of new hypotheses and 
of corresponding test variables. These
two new factors, Interest in Con- 
vergent Thinking and Interest in 
Divergent Thinking, are particularly
noteworthy because they parallel 
two important classes of thinking- 
ability factors discovered in Guil-
ford's laboratory.

In a companion study, correlations 
were derived between 24 primary 
factors of need, interest, and tempera- 
ment on the one hand and 13 factors 
of fluency, flexibility, and originality 
on the other (Merrifield, Guilford, 
Christensen, & Frick, C1961). This 
investigation was inspired by the
suspicion that some factors being 
discovered in aptitude test batteries 
might be of a nonaptitude character.
Using factor measures discovered in 
previous studies, an estimate was 
derived for the correlation between 
the composite of the variables used to 
measure each aptitude factor and the 
composite of variables used to meas- 
ure each nonaptitude factor. Most of 
the correlations proved to be close to 
zero. No correlation was higher than 
about .3, although the proportion of 
correlations considered statistically 
significant appeared to be in excess of 
chance. These results would seem to 
indicate that individual differences in 
these creative thinking factors can be 
accounted for only to a slight degree 
by means of the personality factors 
used. The authors pointed out, how- 
ever, that the test situation tended 
to restrict the possible effect of these 
motivational variables as compared 
with the real life situation. In reality, 
therefore, they may be much more 
relevant than this study would seem 
to indicate. 
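
The source does not give the formula by which the correlation between composites was estimated. One standard way to obtain such a value, assuming unit-weighted sums of standardized variables, is the correlation-of-sums formula sketched below; the function name, the index sets, and the four-variable matrix are hypothetical and are not drawn from the study.

```python
import math


def composite_correlation(r, set_a, set_b):
    """Correlation between two unit-weighted composites.

    r:     full correlation matrix (list of lists) with 1.0 in the diagonal.
    set_a: indices of the variables summed to form the first composite.
    set_b: indices of the variables summed to form the second composite.
    """
    # Covariance of the two sums: all correlations between the two sets.
    cross = sum(r[i][j] for i in set_a for j in set_b)
    # Variance of each sum: all entries of its own block, diagonal included.
    var_a = sum(r[i][k] for i in set_a for k in set_a)
    var_b = sum(r[j][k] for j in set_b for k in set_b)
    return cross / math.sqrt(var_a * var_b)


# Hypothetical matrix: variables 0-1 mark an aptitude factor, 2-3 a
# nonaptitude factor; within-set r = .5, between-set r = .2.
r = [
    [1.0, 0.5, 0.2, 0.2],
    [0.5, 1.0, 0.2, 0.2],
    [0.2, 0.2, 1.0, 0.5],
    [0.2, 0.2, 0.5, 1.0],
]
r_composites = composite_correlation(r, [0, 1], [2, 3])
```

In this hypothetical case the composites correlate about .27, somewhat higher than any single between-set correlation, because summing reduces the share of variance unique to each variable.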


Book of Personality 


The latest in Guilford's long series 
of outstanding books, entitled Per- 
sonality, is a suitable companion to 
his many research contributions in 
the field of personality (Guilford, 
A1959). Designed as a college text, 
the book nevertheless serves for the 
professional psychologist too, as an 
outstanding review of the theory and 
practice of objective personality 
measurement. The book represents 
an attempt to give a systematic
treatment to the field of personality
from the standpoint of factor analysis 
applied in conjunction with experi- 
mental method. This represents a 
considerable restriction when one 
considers the plethora of written 
material in which the term "person-
ality" appears; but in using this word
for the title of his book, Guilford 
states in the preface, "No apologies 
are offered for choosing the compre- 
hensive appearing title. This is per-
sonality as viewed by the author.”

It is Guilford's contention that 
factor analysis is the most powerful 
analytic tool available for the study 
of human personality, but that the 
effective use of this method demands 
the simultaneous application of ex-
perimental logic. Unfortunately,
this condition has not been met in 
many published factor analytic stud- 
ies, a circumstance tending unjustly 
to reflect upon the usefulness of the 
technique itself. 

The first 5 chapters in the book are 
devoted to providing the student with 
background needed to understand 
the later material. Definitions of
personality, different approaches to 
the study of personality, and some 
discussion of controversial issues 
precede a superlative treatment of 
individual differences and of personal- 
ity traits. A valuable feature of this 
material is the lucid exposition of the 
hierarchical model for trait organiza- 
tion. Lack of understanding in this 
area has produced a great deal of con- 
fusion in interpreting previous factor 
analytic results. 

Chapters 6 through 13 review 
various methods of assessing per- 
sonality traits. These methods are 
classified under the chapter headings 
of: “Morphological and Physiological
Methods”; “Observational Methods,
Ratings, and Interviews”; “Personal-
ity Inventories”; “Interest and Atti-
tude Measurement”; “Behavior
Tests”; “Expressive Methods”; “Pro-
jective Techniques”; and other clini-
cal methods. The emphasis in these
chapters is upon evaluation of the 
strengths and weaknesses of the 
methods as determined by the avail- 
able evidence from research findings. 
The last 5 chapters are devoted to
the description of the factor structure 
of personality based upon published 
research in the following areas: 
somatic dimensions, dimensions of 
aptitude, dimensions of temperament, 
hormetic dimensions, and dimensions 
of pathology. These chapters are 
particularly valuable for the research 
psychologist as a unified picture of 
the factor analytic contributions to 
personality measurement over the 
past several decades. An interesting 
and provocative feature is Guilford's 
attempt to provide an organization 
scheme for nonaptitude factors analo- 
gous to the "structure of intellect" 
for aptitude factors. 


Brief Overview 


It is difficult to appreciate the full 
impact Guilford has had on per- 
sonality measurement because the 
character of the field today owes so 
much to his influence. It is easy to 
take for granted the present state of 
affairs without realizing how it has 
evolved. Although an enumeration 
of specific contributions will not show 
the full debt which the field owes to 
Guilford, it can nevertheless help to 
provide at least a general outline. 
Some of Guilford's more important 
contributions to personality, then, are 
as follows: 

1. He showed that the popular 
Extraversion-Introversion concept is 
a multidimensional composite rather 
than a unifactor variable. 

2. He demonstrated that personal- 
ity dimensions cannot be created by 
fiat, based on armchair thinking, but 
must be made to stand the test of 
empirical analysis. 

3. He made apparent the impor-
tance of factor analyzing items rather
than employing unanalyzed item 
composites in the development of 
personality dimensions. 

4. He developed the first important
factorial test of personality. 

5. Through a long series of careful 
analyses, he has provided the field of 
psychology with 13 well established 
factors of temperament. 

6. He has pointed out the distor- 
tions involved in keying personality 
items in a test for more than one 
total score variable. 

7. He has emphasized the distor- 
tions involved in using forced-choice 
personality test items. 

8. He has formulated a tentative 
“structure of temperament” analo- 
gous to his "structure of intellect” 
in the ability field.

9. He has presented a valuable 
master plan for effective use of factor 
analysis in personality research. 

10. He has developed several 
widely used tests in the personality 
area. 

11. He has reported the most 
definitive analysis yet carried out of 
the factorial structure of interests. 

12. He has furnished important 
results on the interrelationships 
among ability, interest, motivational, 
and temperamental variables. 

13. He has written an invaluable 
treatise on the theory and practice of 
the trait approach to personality. 

14. Last but not least, by his 
teaching and by example, Guilford 
has illuminated a path which is being 
followed more than ever today by 
serious students who wish to provide 
a sound scientific basis for personality 
measurement. 
GUILFORD, J. P. A new revised edition of the
Army Alpha Examination and a weighted
scoring for three primary factors. J. appl.
Psychol., 22, 239-246. (c)

Educ. Dig., 24, 49-51. (c)

GUILFORD, J. P. Human abilities in educa-
tion. Calif. J. instruct. Improv., 1, 3-6. (d)

GUILFORD, J. P. New frontiers of testing in
the discovery and development of human
talent. In, Seventh Annual Western Regional
Conference on Testing Problems. Los An-
geles: Educational Testing Service. Pp. 20-
32. (e)

GUILFORD, J. P. Psychological measurement.
In G. H. Seward and J. P. Seward (Eds.),

Reasoning: Manual of instruction and inter-
pretations. Beverly Hills, Calif.: Sheridan
Supply.
COMPARATIVE PSYCHOLOGICAL STUDIES OF NEGROES
AND WHITES IN THE UNITED STATES: 


A RECLARIFICATION 


RALPH MASON DREGER 
Jacksonville University 


Pasamanick's "clarification" of Dreger and Miller's article extends the
meaning of the original beyond its intent. In conjunction with Pasa-
manick's 1946 article it is shown that the samples of white and Negro
infants are inadequate. Dreger and Miller could not have been aware of
Pasamanick's 1946 reliability procedures, for these were not described
until his 1962 clarification. In 1962 Pasamanick's “major comparison”
between white and Negro infants does not seem to be the same as it was
in 1946. Contrary to Pasamanick's contention, Dreger and Miller did
not attack Gesell's Developmental Schedules. Later work done by
Pasamanick may substantiate his conclusions, but the criticisms of his
1946 article still hold.


In connection with Pasamanick's
“clarification” (Pasamanick, 1962) of
our article (Dreger & Miller, 1960), I
shall address my remarks specifically
to the following points: (a) What
exactly Miller and I said about Pasa-
manick's 1946 article in our original
article. (b) What Pasamanick's in-
adequacies of sampling were. (c)
What Pasamanick's 1946 article said
about estimating skin color. (d)
What the major comparison between
Negroes and whites seems to be in
Pasamanick's 1946 article. (e) What
the relation is between our criticisms
of Pasamanick's 1946 article and the
Gesell Developmental Schedules.


ORIGINAL STATEMENT ABOUT
PASAMANICK'S WORK


After presenting Pasamanick's con-
clusions that differences among his
New Haven sample appeared to be
more within than between racial
groups, although Negro infants
proved relatively accelerated in gross
motor behavior, Miller and I wrote
(Dreger & Miller, 1960):

Inadequacies of sampling, as well as the more
general difficulty of estimating skin color sub-
jectively, tend to invalidate Pasamanick's
conclusions. His infants may very well have
been comparable for other reasons than the
quality of diet during and after pregnancy
to which Pasamanick attributes his results
(p. 362).


Strictly speaking, because these are 
the only statements Miller and I 
authorized for publication, they are 
the only ones we should be called 
upon to defend. Please note that in 
our statement we did not say that 
Pasamanick’s Negro parents were un- 
representative of New Haven Ne- 
groes; neither did I say anything of 
this nature in my correspondence 
with Pasamanick. 


PASAMANICK’S INADEQUACIES OF 
SAMPLING 


The 53 Negro infants in Pasama- 
nick’s study had parents between ages 
20 and 40 (Pasamanick, 1962) whose 
schooling averaged 10.3 and 10.1 for 
fathers and mothers, respectively. 
Census figures are given in medians 
rather than means, but even with due 
allowance for some error of translit-
eration from Census medians to Pasa-
manick's means, it is evident that these
Negro families did not represent 
Negro families in the middle 1940s. 
Pasamanick (1962) contends that
there is an error "in lumping together 
the educational attainments of all 
Negroes 25 years of age and over, 
thereby including the older and far 
less well educated individuals." The 
actual facts are shown in Table 1. 


TABLE 1


MEDIAN YEARS OF SCHOOLING FOR INDIVID-
UALS NOT ATTENDING SCHOOL IN THE
UNITED STATES IN 1940


                    Nonwhite
Age    Total    Male    Female
20     10.7     6.      7.7
21     10.9     6.8     7.9
22     10.9     6.6     7.7
23     10.9     6.7     7.7
24     10.8     6.7     7.7


Note.—United States Bureau of the Census (1947, p.
127, Table No. 144).

There are no comparable break- 
downs for ages 25-40, but these ages 
would a fortiori yield lower medians. 
By 1946 the picture would be very 
little different, for the GI Bill of 
Rights had not had time to take 
effect. It was not until the middle 
1950s that the nonwhite median rose 
to where it could be compared with 
the means Pasamanick reports, as is 
evident from the medians for ages 25- 
29 (see Table 2). In years of school- 
ing, Pasamanick's Negro families did 
not represent Negro families as a 
whole. Even though educational
level may not be the most important
variable, it is one very obvious and
very controllable variable by which
representativeness of a sample may
be assured. Further, to set aside
schooling, as Pasamanick (1962) does,
as irrelevant to the Developmental
Quotient because later it was shown
that children with grammar school
educated parents did not differ significantly
in mean scores from those
with high school educated parents, is
post hoc reasoning.


TABLE 2

TOTAL AND NONWHITE MEDIANS
FOR AGES 25-29

Date            Total    Nonwhite
April 1940      10.4     7H
April 1950      12.1     8.7
October 1952    12.2     9.3
March 1959      12.3     10.9

Note.—United States Bureau of the Census (1961, p.
109, Table No. 139).

White infants were even more un- 
representative than Negro infants. 
The 57 babies in Pasamanick's Group 
W1 were illegitimate infants in foster 
homes. Group W3, containing 20 
babies, were nursery school applicants 
with parents having 17.6 and 17.0 
years of schooling, respectively, for 
fathers and mothers. The 22 in 
Group W2 were illegitimate infants 
in an institution, who were, according
to Pasamanick (1946), 

a carefully selected group limited to children 
who were normal initially or who rose to 
within the average developmental range on 
re-examination at the Clinic of Child Develop- 
ment after a period more or less prolonged of 
foster-home care. They were chosen to illus- 
trate the effect of environmental conditions 
upon infant development. [They lived in two 
large nurseries and were cared for by two or 
three nurses.] Of necessity no individualized 
care was possible, and about the only time 
they received any social stimulation was 
during a change of diapers. At such times 
some special effort was made to offer each 
child a modicum of attention. The effects of 


such environmental impoverishment are well 
described elsewhere (p. 16). 


Pasamanick (1962) set the three 
white groups aside as if they were not 
involved in the comparisons, a point 
I discuss below. He does not try to, 
and could not if he tried, defend the 
representativeness of his white in- 
fants. On one major score, then, 
years of schooling of parents, Pasa- 
manick's Negro infants did not repre- 
sent American Negro infants, and on 
a number of scores the white infants
described in his 1946 article do not 
represent American white infants. 
These are the “inadequacies of sam- 
pling” to which Miller and I referred. 


ESTIMATING SKIN COLOR 
In 1946 Pasamanick wrote: 


On the second visit of the subjects, the degree 
of pigmentation of the babies and their 
mothers (if brought by the mother) was esti- 
mated. They are arbitrarily divided into four 
groups: black, dark brown, light brown, and 
very light brown. In all, four mothers were 
black, 20 dark brown, and 9 light brown, 
while 21 infants were dark brown, 11 light 
brown, and two very light brown, indicating 
that the mothers as a whole are much darker 
than their children (pp. 12 f.). 

Note that Pasamanick did not in 
1946 say that at first he utilized a 
color wheel, then that two judges 
were used who were found to agree 
with each other with almost perfect 
reliability, as we are told in 1962. We
could not have been aware of Pasa- 
manick's precautions to insure reli- 
ability, though I am glad to know 
that these were taken. The “general 
difficulty” to which Miller and I re- 
ferred may have been obviated to a 
degree by Pasamanick’s efforts. 
However, that general difficulty, 
which is involved in any comparison 
of Negroes and whites, probably still 
holds for Pasamanick’s study, as it 
does for many others, as Miller and I 
wrote (Dreger & Miller, 1960): 

One suspects that in a number of cases so- 
called racial comparisons are being carried 
out between one group designated as “white” 
and another designated as "black" which 


consists of many who are partly or even largely 
white (p. 372). 


PASAMANICK'S MAJOR COMPARISON 
In 1962 Pasamanick writes: 

The major comparison in our paper involved
a contrast of the aforementioned representative
Negro infants with the normative group
used by Gesell for establishing his developmental
norms (p. 244).


If that is the major comparison in his 
1946 article, Pasamanick succeeded 
admirably in concealing the fact. 

There are several places in his 40 
pages of text where indeed Pasa- 
manick mentions a comparison of his 
Negro infants with Gesell's norms, as, 
for example, on pages 21-23, 28-29, 
and 30. These seem to be incidental, 
however. There is no description at 
all of Gesell's samples, while there are 
extended descriptions of the samples 
Pasamanick calls W1, W2, and W3. 
In the 15 tables and five figures, al- 
though there are implied Gesell 
norms, there are several direct statis- 
tical descriptions of the white sam- 
ples which appear to be the major 
comparison groups for the Negro 
infants. 

That Pasamanick himself did not
consider his major comparison was
between Gesell's norms and his Negro 
sample is demonstrated by the first 
sentence of his Conclusions (Pasa- 
manick, 1946): 
The development of a group of 53 Negro in- 
fants (28 male, 25 female) living with their 
own families, was studied and compared 
with that of three groups of white infants; 
Group W1, 57 infants (25 male, 32 female), 
who were living in boarding homes; Group 
W2, 22 infants (7 males, 15 females) living 
in a child-caring institution; and Group W3, 
20 infants (8 males, 12 female) from superior 
families and living at home (p. 41). 


If, however, appearances to the 
contrary notwithstanding, the major 
comparison Pasamanick made in his 
1946 article was between the Gesell 
norms and his “representative Negro 
infants,” and if, as Pasamanick con- 
tends, the Gesell norms arise from rep- 
resentative samples of white infants, 
then it was even more incumbent 
upon Pasamanick to do what he did 
not do, that is, show that his Negro 
infants represented all Negro infants 
in the country, not merely that they 
were a sample of New Haven's population.


OUR CRITICISMS AND GESELL'S
SCHEDULES


It scarcely seems necessary that I 

should have to discuss the Gesell De- 
velopmental Schedules when Miller 
and I raised no question whatsoever 
about them in our original article. In 
fact, we stated specifically (Dreger & 
Miller, 1960): 
Tests of infant intelligence have been suspect 
(possibly unjustifiably) for many years as 
predictors of subsequent intellectual perform- 
ance. Nevertheless, as measures of compara- 
tive psychomotor developmental level, they 
perform a useful task (p. 362). 


I wrote to Pasamanick: 


I myself have employed Gesell's scales fairly 
extensively, but find myself leaning toward 
[Psyche] Cattell’s scale. The fact that the 
latter yields only a single score is a handicap. 
But I am not at all sure what the four areas 
of Gesell's tests mean, so the seemingly greater 
information provided is only qualitative as 
far as I am concerned. A factor analysis of 
Gesell's scales would, I am fairly certain, dis- 
close different dimensions. 


Ball (1961) has demonstrated my 
thesis by revealing from 6 to 12 fac- 
tors on the Gesell, depending upon 
age level. Tentative results suggest 
that these dimensions are not the 
same as Gesell's logical categories. 

Pasamanick takes exception to the 
statement (from a letter) that I had 
heard that Gesell's norms were de- 
rived from upper classes. I cannot 
defend this clinical gossip. I referred 
to it in a letter, not in anything I in- 
tended for publication. To be sure, 
there is no description of the norma- 
tive samples in Gesell's Developmental 
Diagnosis (Gesell & Amatruda, 1947). 
Pasamanick, who should know, states 
that they came from a heterogeneous 
group. It pleases me to know that the 
schedules which I have used for a 
number of years, in spite of the ap- 


parently unjustified criticism, and 
have regarded as excellent samples of 
infant behavior in general, are in 
reality representative. That I have 
not accepted as dimensionally sound 
Gesell's four categories derived by 
logical analysis has nothing to do 
with the usefulness of his schedules 
for describing the behavior of infants. 


CONCLUSION 


Work by Pasamanick done later 
than our review period may or may 
not substantiate his 1946 study. I 
deeply regret that one article of his 
(Knobloch & Pasamanick, 1958) did 
not reach Miller's and my attention. 
It came at the very end, December 
1958, of our review period and was 
not referred to in the abstracts we 
used as guides to the literature. It 
seems to support Pasamanick's basic 
conclusion of greater differences with- 
in than between racial groups on the 
Developmental Schedules, for racial 
groups as loosely defined. Since it ap- 
pears to corroborate my egalitarian 
biasses, in any future review I should 
subject this article to especially 
searching analysis, for I have learned 
to be especially critical of that which 
shores up my own prejudices. This 
was the way we treated Pasamanick's 
1946 article—and found it wanting. 
With the information available to us 
Miller and I were not at all “a bit pre- 
mature in disposing of” Pasamanick's 
work. 
"PATTERN RECOGNITION" COMPUTERS AS MODELS
FOR FORM PERCEPTION? 


LEONARD UHR
University of Michigan 


tions are compared with models and suggestions in the psychological 


literature. 


A surprising number of programs 
and proposals for computers that, in at 
least certain senses of the phrase, 
"perceive forms" have been pub- 
lished in the past 5 years (for partial 
reviews, see Minot, 1959; Steinbuch, 
1958; Stevens, 1957; Uhr, 1960). 
This paper will attempt to demon- 
strate that this work is of theoretical 
interest to the behavioral scientist; 
that it offers testable models for per- 
ception, neural organization, and con- 
cept formation similar in kind to those 
that have been discussed in the bio- 
logical and psychological literature. 
Attneave and Arnoult (1956) ex- 
amined some of the earliest of this 
work for its pertinence to a psychol- 
ogy of form perception; but by far the 
largest and most interesting body of 
research has been done since 1957. 
Attneave and Arnoult (1956) noted, 
in the context of their general review 
of empirical pattern perception
research, “It seems extraordinary,
therefore, that so little progress has 
been made (and, indeed, that so little 
effort has been expended) toward the 
systematizing and quantifying of such 
factors [of form perception]" (p. 452).
This review will present results, 
largely by nonpsychologists, that, it 
is contended, have come a long way 
toward laying this groundwork. 


COMPUTER SIMULATIONS AND 
PSYCHOLOGICAL THEORIES


Problem of Form Perception 


The problem posed the programer
or designer of a computer for "pattern
recognition" (or "character recognition,"
as it is sometimes termed
when a specific set of predetermined
patterns, usually the alphanumeric
symbols as printed in a special type
font, is the only set to be processed)
is the many-to-one mapping of different
inputs into appropriate output
sets. These output sets are the sets
into which we, as human perceivers, 
map the inputs. They group things 
across an unknown set of geometric 
transformations and deformations; 
and the problem of pattern recogni- 
tion is to discover the operations that 
will effect this mapping, or to pro- 
gram a computer that will itself dis- 
cover these operations. They may or 
may not be the operations that 
people use. If they are not, they will 
still be functionally equivalent to 
people's operations, since the per- 
ceptual mechanism of the human 
being has posed the problem. At the 
least, then, a successful pattern recognition
computer will be a functional
equivalent to the human system.
But it will almost certainly be
more than that, for its inner mech- 
anism will be interpretable in terms 
of subfunctions, and in terms of 
neural networks. And the functions 
that will have evolved in the success- 
ful computer should have some rela- 
tions to the functions that have 
evolved in the brain. In addition, 
most computer programs embody 
hidden psychological theories or 
hunches that their logic is designed to 
embody. It should be to the psy- 
chologist’s interest to try to 
strengthen these in two ways: first, 
by developing better and broader 
theories, and second, by suggesting 
pertinent data and experiments by 
which these theories can be tested, 
developed, modified, and rejected. 
The problem of giving the ap- 
propriate name or response to an in- 
put stimulus is far from being an 
unimportant or peripheral problem 
to the psychology of perception; yet 
virtually nothing in the way of a 
theory as to how this is done by hu- 
man beings seems to exist. Nor is 
there anything approaching uni- 
versal agreement that it can be 
solved trivially. Thus Vernon (1952) 
states: 
The last essential stage of the perceptual process
then is that of identification and understanding
of meaning... a process of enormous
complexity and extreme difficulty of
analysis. Such an analysis is at present
impossible by the methods of experimental
psychology (pp. 22-23).


This is the problem toward which 
most research in psychophysics and 
perception would seem to be directed, 
but this research has posed itself a 
set of auxiliary problems chiefly re- 
lating to the dynamics of the proc- 
esses that go into building the com- 


pleted form to be mapped. At this 
point, however, the central question 
must again be raised—given a suc- 
cessfully preprocessed pattern, 
clearly sensed by the perceiver, how 
is the perceptual act, the deciding 
upon its meaning (the giving of a 
name or making of a response) ef- 
fected? From another experimental 
approach, much learning and con- 
cept-formation experimentation ex- 
amines ways in which mapping oper- 
ations can be built up and modified, 
and asks questions about output sets, 
in the form of stimulus generalization 
and equivalence. But, again, this has 
not led to anything like a theory of 
form perception. 

Rashevsky (1948) and Nieder 
(1960) have suggested the beginnings 
of a neural net theory as applied to 
pattern perception. Marshall and 
Talbott (1942), Osgood and Heyer 
(1952), and Day (1956) have, all 
working in the same direction, de- 
veloped what is probably the closest 
thing to a theory. Schade (1956) has 
constructed a photoelectric analog 
model of a visual system. But these 
have chiefly been used to examine 
the question of how a precise, clear 
pattern of contours is presented to 
the cortical processing mechanism 
over the range of distorting influences 
(for example, poor illumination, ny- 
stagmus, retinal irregularities) and 
rigid transformations that have so 
little effect. Pitts and McCulloch 
(1947) and Culbertson (1950) have 
suggested mechanisms for mapping 
inputs from this point on; but their 
suggestions seem to be, essentially, 
brute force approaches and empty in 
the sense that they say something 
like “the problem is one of finding the 
appropriate partitionings of a finite 
set; here is a way to burrow through 
the set in an exhaustive manner." 
Deutsch (1955, 1960), Sutherland 
(1959), and Hassenstein (1959) probably
come closest to presenting detailed
theories of form perception.
These are all for lower organisms, 
and, as we shall see, are all extremely 
similar to pattern recognition pro- 
grams for computers. 


Methods of Computer Simulation 


A “general purpose digital com- 
puter" is usually the vehicle for ex- 
periments toward building pattern- 
recognition devices. A program, 
which is a sequence of instructions 
(commands directing the computer to 
perform simple logical operations), 
converts any computer of this sort 
into a simulation (to—theoretically— 
any desired degree of accuracy) of 
any desired special-purpose analog 
or digital computer (including a 
neural net or a brain). So it is a con- 
ceptually trivial matter whether a 
specific proposal has been tested out 
by programing a computer, or by,
effectively, wiring in the program
through the building of specially de- 
signed logical circuitry, or by build- 
ing some sort of an analog machine. 
Nor is it important whether the ma- 
chine is electrical, mechanical, or 
optical. The machine itself is not 
the model; its program transforms it 
into an information processor that is.
The thing of fundamental interest 
might best be something charac- 
terized as a “calculus of recognition,” 
of which human form perception is a 
specific instance. The method of 
realization is thus trivial in the same 
sense as the question of whether 
numbers have been added by tally- 
ing fingers, manipulating an abacus, 
punching a Friden, or charging a 
capacitor. The calculus embodies a 
theory; the program or other method 
for implementing provides a com- 
putation, or evaluation, of the theory. 
The importance of the computer lies 
chiefly in its size and speed—its abil- 


ity to model complex theories. For 
convenience of exposition, the word 
"program" will be used to designate 
pattern recognizers, whether programed
digital or wired-in analog
computers.


Computer Programs as Models 


These programs, then, can be con- 
sidered working models for the func- 
tions that they perform, and for 
some other computer (for example, 
the brain) that performs these func- 
tions. They can be considered (and 
conceived) as models at the molecu- 
lar level of neurons, or at the level of 
behavior, function, or psychological 
system. They are, however, models 
in every sense of the concept "model" 
of the animal and human brain. 
They are models (or embodied 
theories) in exactly the same way as 
our traditional neurophysiological 
and psychological models, for higher 
mental processes, and particularly 
for form perception. The range of 
phenomena that Day's (1956), Suth- 
erland's (1959), Deutsch's (1960), 
Nieder's (1960), or Platt's (1960) 
model covers is relatively restricted, 
almost certainly not as great as the 
range covered by a pattern recogni- 
tion program. (For example, Nieder 
suggests integrating over distances
between all pairs of points in the pat- 
tern, and shows that outputs will be 
invariant over rigid transformations.) 
Nor are the number of anchor points 
at which the theory is founded in 
data, or the functional relations the 
theory suggests between data, or the 
further functional relations and new 
data the theory predicts richer than 
for the computer programs. Nor is it
even clear that the theories designed 
to talk about the physiological sys- 
tem are more consistent with all 
known data, or less farfetched in 
making uncontroverted but nevertheless
somewhat unlikely assumptions,
despite the fact that the computer
programs have rarely been designed
with the physiological model
in mind.
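
As a rough illustration of the kind of invariance attributed to Nieder in the parenthetical above, the following Python sketch sums the distances between all pairs of points in a pattern and checks that the sum survives a rotation plus a translation. The function names, the sample pattern, and the reduction of the "integration" to a single sum are assumptions made for illustration; they are not taken from Nieder (1960).

```python
import math

def pairwise_distance_sum(points):
    """Sum of Euclidean distances over all unordered pairs of points."""
    return sum(math.dist(p, q)
               for i, p in enumerate(points)
               for q in points[i + 1:])

def rigid(points, angle, dx, dy):
    """Rotate by `angle` radians about the origin, then translate by (dx, dy)."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y + dx, s * x + c * y + dy) for x, y in points]

# Hypothetical three-point pattern and two transformed copies of it.
pattern = [(0, 0), (0, 3), (2, 0)]
moved = rigid(pattern, math.radians(37), 5.0, -2.0)   # rigid transformation
scaled = [(2 * x, 2 * y) for x, y in pattern]          # not a rigid transformation

same = math.isclose(pairwise_distance_sum(pattern), pairwise_distance_sum(moved))
print("unchanged by rotation and translation:", same)
print("changed by magnification:",
      not math.isclose(pairwise_distance_sum(pattern), pairwise_distance_sum(scaled)))
```

A change of scale doubles the sum here, so a measure of this sort is invariant over rigid transformations only, and quite different patterns may share one value.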

It is true that on the surface the 
computer program may appear com- 
pletely different from the animal. 
But this is usually due to irrelevant 
differences of the sort mentioned 
above. Digital computers are se- 
quential realizations of functions 
that can often be performed more ef- 
ficiently by parallel operations; and 
it is their parallel counterparts, or 
isomorphs, that we should be com- 
paring to the brain. But the same 
program—that is, the same sequence 
of operations—that is performed by 
an analog computer can be performed 
by the general purpose digital com- 
puter, and, as of today, with far 
greater economy and flexibility. 

The computer models, in fact, 
seem superior, in important respects, 
to our traditional models. First of 
all, they have been specified pre- 
cisely—this is not at all trivial; re- 
cent attempts to put such theories as 
Hebb’s and Hull's into precise, testable
form have met with major difficulties
(Dunham, 1957; Rochester,
Holland, Haibt, & Duda, 1956). 
Second, they have actually been 
examined for their consequences 
(through the running of the program) 
and tested as to their success. Nor 
is this testing to see how well the 
program can recognize the letters of 
the alphabet a trivial one. For no 
psychological theory could hope to 
pass such a test (nor does there ap- 
pear to be any theory that could even 
be tested). The relatively precise 
theories, such as Day’s (1956), Has- 
senstein’s (1959), or Sutherland's 
(1959), can account for only a small 
part of the process. The attempts to 
discuss the entire process (Hebb, 
1949; Kohler, 1951) are suggestions 
and not theories at all. 
The suggestion, then, is that a
number of the following programs 
could be tested, in the way theories 
are tested, on a wide variety of new 
problems for which they were not de- 
signed; and the contention is that at 
least some of these programs would 
often exhibit surprising success. For
example, any of the programs that 
possess even rudimentary abilities to 
become modified as a function of the 
inputs presented to them should show 
curves that will look similar to learning,
confusion, and forgetting, as a
function of scheduling of the experiences
given them. Those programs
with greater possibilities for self- 
modification (such as those that at- 
tempt to optimize the sets of meas- 
urements that they make) should be 
able to respond to an even wider 
variety of changing conditions, and 
to process an even greater variety of 
patterns (such as many new pattern 
sets, including "meaningless" pat- 
terns, faces, line drawings, speech). 
Even the completely preprogramed 
proposals should often have some 
abilities with alphabets for which
they had not been designed. On the 
other hand, the very question of how 
to go about subjecting these pro- 
grams to adequate tests has scarcely 
been posed, and very few satisfactory 
experimental tests have actually been 
made. The establishment of ade- 
quate criteria that such programs 
must meet if they are to hope to be 
taken seriously should be very similar 
to the establishment of adequate con- 
ceptual and experimental methods 
for form perception. The extension 
of these programs, so that they can 
begin to handle phenomena such as 
inputs of graded intensities, and in- 
puts that endure and change over 
time should throw enormous light on 
their powers and weaknesses as 
theories. But, once again, it is not at 
all obvious that these programs are 
incapable of handling such extensions.

It is not the purpose of this paper 
to review physiological and psycho- 
logical theorizing as to form percep- 
tion. But, in order to place the com- 
puter programs that will be dis- 
cussed in a proper perspective, rel- 
evant related theories will be men- 
tioned when appropriate. 


PATTERN RECOGNITION PROGRAMS 


The different methods proposed 
for pattern recognition have usually 
been classified by terms such as 
"template matching" versus “an- 
alytic" versus "random" (Minsky, 
1959; Rosenblatt, 1960a, 1960b). 
They have further been contrasted in 
their abilities to "self-organize" or 
"learn" or “adapt.” A computer 
simulation program that has re- 
cently been described (Bledsoe & 
Browning, 1959), along with the con- 
ceptualization of a general type of 
computer developed by Uttley 
(1959), seem to this writer to suggest 
a different approach to organizing
and comparing the different methods
proposed and, further, an approach
that suggests interesting similarities 
between superficially different com- 
puter methods and between com- 
puter methods and neural nets. 

To simplify exposition, all per- 
ception recognition methods except 
for a few analog machines will be dis- 
cussed as though a matrix of cells— 
whether photocells or retinal cones— 
performs the first detection step in 
the process. This appears to be the 
case for living visual sensing organs. 
Similarly, most machines use an input
"matrix" or "raster." And, in any
case, there is a basic grain of “just 
noticeable differences" in both spatial 
and quantitative dimensions. 

Almost all programs have been 
tested on letters of the alphabet. This 
is extremely unfortunate, for it ob- 


scures the fact that letters are a far 
from trivial or restricted set of out- 
line patterns of forms. Consider as 
test cases a small set of the typical 
stimuli that we might like a perceiver 
to recognize; for example, the letters 
A, B, C, and line drawings of a face, 
of a table, of a smile. The best ex- 
amples of the alternate methods will 
go about this job as follows: 


Template Matching 


A template matching method will 
use a template for each of the pat- 
terns (or, to the extent this can be 
tolerated, a template for each shape 
variant of each pattern). The stimu- 
lus to be recognized will be held up in 
front of the template, and some sort 
of matching method will be used to 
assess the percentage of area they 
have in common. The stimulus can 
be translated in two (or even three) 
dimensions in relation to the tem- 
plate; it can be rotated, or even 
magnified. The position and tem- 
plate of highest match will determine 
the recognition response. (Good ex- 
amples are Anonymous, 1960a, 1961; 
Fitzmaurice, Sabbagh, & Elliott, 
1959; Rabinow, 1957; for a variety of
methods see Broido, 1958.) 

This method is obviously quite
powerful under favorable condi- 
tions—when all members of a set of 
stimuli (for example, all A's that the 
machine must recognize) are exactly 
the same, are known in advance so 
that the template can be cut, and are 
exactly positioned or positionable.
But the method fails miserably as a 
result of what most people would 
intuitively call, and to the human eye 
would certainly be, trivial displace- 
ments. Ad hoc methods can be de- 
vised to take care of many of these 
cases as they arise, but then the
device quickly loses any theoretical 
interest or even practical feasibility. 
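
The procedure described above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not a reconstruction of any of the cited machines: the 4 x 4 grids, the two templates, the overlap score (shared filled cells divided by the union of filled cells), and all function names are hypothetical.

```python
# Grids are lists of equal-length strings; '#' marks a filled cell.

def filled(grid):
    """Return the set of (row, col) positions of filled cells."""
    return {(r, c) for r, row in enumerate(grid)
            for c, ch in enumerate(row) if ch == "#"}

def overlap(stimulus, template, dr, dc):
    """Shared area of the template and the stimulus shifted by (dr, dc),
    as a fraction of the area covered by either."""
    s = {(r + dr, c + dc) for r, c in filled(stimulus)}
    t = filled(template)
    union = s | t
    return len(s & t) / len(union) if union else 1.0

def recognize(stimulus, templates, max_shift=1):
    """Return (best_score, best_name) over all templates and translations tried."""
    shifts = range(-max_shift, max_shift + 1)
    return max((overlap(stimulus, tpl, dr, dc), name)
               for name, tpl in templates.items()
               for dr in shifts for dc in shifts)

TEMPLATES = {
    "L": ["#...", "#...", "#...", "###."],
    "T": ["###.", ".#..", ".#..", ".#.."],
}

# An "L" displaced one column to the right of its template.
shifted_L = [".#..", ".#..", ".#..", ".###"]

print(recognize(shifted_L, TEMPLATES))               # translations of one cell allowed
print(recognize(shifted_L, TEMPLATES, max_shift=0))  # no translation allowed
```

With translation allowed, the displaced figure matches the "L" template exactly. With the stimulus held fixed, the same figure overlaps the "T" template more than the "L" template, which is the sort of failure under trivial displacement noted above.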

Template matching does not depend
upon logical circuitry of the sort
made available by computers and, as 
a result, this method has a long his- 
tory of optical-mechanical gadgets 
dating back at least to the turn of the 
century. Nor are these optical ma- 
chines trivial, for the methods and 
machines developed by Rabinow and 
by Baird-Atomic, even though they 
are purely mechanical and optical 
and do not employ computers, are 
among the most advanced and suc- 
cessful of the template matching 
group (see Anonymous, 1961). And, 
in fact, they compare quite favorably 
at the present time with pattern 
recognition methods of far greater 
sophistication.  Rabinow's method 
employs templates for simple geo- 
metric configurations (such as 
straight lines, curves, and angles) 
that are subparts of the characters, 
and recognizes a character by the 
combination of templates matched. 
This type of template matching is, in 
fact, almost identical conceptually to 
some of the very powerful analytic 
approaches to be described later, and 
indicates some of the confusing simi- 
larities that exist between superfi- 
cially different methods. Template 
matching is also already disturbingly 
similar to functions western philoso- 
phers and psychologists have fre- 
quently dignified with terms such as 
"ideas," "images," “ideals.” 


Primitive Analytic Methods 


A number of character recognition 
machines have been built in the past 
few years, for use by banks and sales 
offices, to recognize the digits and, 
occasionally, the letters of the alpha- 
bet. All of these machines were 
designed to recognize specially pre- 
pared symbols—printed in special 
optical or magnetic inks with type 
fonts specially designed to give maxi- 
mal differentiations by the recogni- 
tion logic used. They embody a wide 


variety of theories of form perception.
But the problem posed by these 
machines has been so severely re- 
stricted that it is difficult to judge, 
simply from how successful the 
machine may be in its limited do- 
main, what potential powers it might 
have for more general form percep- 
tion. Success can just as well reflect 
upon good choice and preparation of 
inputs as upon the recognition meth- 
ods. And, in fact, the most successful 
machines of this sort are still punched 
IBM cards and magnetic tape. For 
the machine that can recognize only 
specially prepared inputs of a known 
array is conceptually identical to 
these. And, worse, it is of question- 
able additional practical worth, since 
only material that is originated by 
the machine can be recognized—so 
that this material might just as well 
have been prepared in the more reli- 
able old-fashioned way. 

These machines usually employ 
logics for analyzing inputs with re- 
spect to certain qualities that make 
use of weak analytic methods very 
close to, and easily understood in the 
context of, template matching. For 
example, one machine looks at a 
simple Boolean function of three or 
four critical cells in a photoelectric 
input matrix (Bailey & Norrie, 1957), 
and chooses the character that satis- 
fies the function. A Japanese re- 
search program is employing similar 
logic (Wada, Takahashi, Iijima, 
Okumuru, & Imoto, 1960). Stearns 
(1960) has built a similar machine. 
Rabinow (1957) has realized a similar 
logic in an optical-mechanical ma- 
chine. Others divide the matrix into 
four or eight parts (Anonymous, 
1960b; Shepard, Bargh, & Heasley,
1957), or into columns (Eldredge, 
Kamphoefner, & Wendt, 1956, 1957), 
or into rows, or into both, and inte- 
grate filled areas within each of these 
parts. Others count the intersections 
between figures and particular line
configurations (Dimond, 1957). 
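
Two of the logics just described, a Boolean function of a few critical cells and the integration of filled area within parts of the matrix, can be illustrated with a short Python sketch. The 5 x 3 glyphs, the choice of critical cells, and the function names are all assumptions made for the example; none is taken from the machines cited.

```python
# Hypothetical 5 x 3 character grids; '#' is a filled cell.
GLYPHS = {
    "0": ["###", "#.#", "#.#", "#.#", "###"],
    "1": [".#.", ".#.", ".#.", ".#.", ".#."],
    "7": ["###", "..#", "..#", "..#", "..#"],
}

# Three hand-picked "critical cells" as (row, col); the choice is an assumption.
CRITICAL = [(0, 0), (2, 1), (4, 0)]

def cell(grid, pos):
    """True if the cell at pos is filled."""
    r, c = pos
    return grid[r][c] == "#"

def classify_by_cells(grid):
    """Choose the character whose Boolean function of the critical cells is satisfied."""
    a, b, c = (cell(grid, p) for p in CRITICAL)
    if a and not b and c:
        return "0"
    if not a and b and not c:
        return "1"
    if a and not b and not c:
        return "7"
    return None

def quadrant_counts(grid):
    """Filled-cell counts for the top-left, top-right, bottom-left, and bottom-right parts."""
    mid_r, mid_c = len(grid) // 2, len(grid[0]) // 2
    counts = [0, 0, 0, 0]
    for r, row in enumerate(grid):
        for c, ch in enumerate(row):
            if ch == "#":
                counts[2 * (r >= mid_r) + (c >= mid_c)] += 1
    return counts

for name, grid in GLYPHS.items():
    print(name, classify_by_cells(grid), quadrant_counts(grid))
```

In this sketch a "1" moved one column to the left (`["#.."] * 5`) satisfies the function written for "0", which suggests how heavily such logics lean on controlled preparation of the input.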

In each case, the areas or lines 
chosen on the input matrix can be 
considered templates. The require- 
ment of crossing a line can be con- 
sidered one of partial template match- 
ing (at least one cell of the template 
overlapping the input figure, rather 
than all cells, or most cells, as the 
cutoff point). A closely related current
psychological theory is Sutherland's
(1959) model for form perception
in the octopus, as depending on
the pattern’s vertical and horizontal 
extent. Deutsch (1960) has put for- 
ward an opposing theory of the same 
type to explain Sutherland’s data, 
essentially a suggestion of a retinal 
net to measure distance.

Dickinson (1960) presents results 
of scanning a character with a narrow 
slit in the way done by Eldredge et al. 
(1957), including the use of this 
method to design a type font of char- 
acters that will be optimally dis- 
criminated. On the basis of extensive 
experimentation, and theoretical con- 
ceptualizations of the problem in 
terms of the amount of separation 
between patterns possible in an n- 
dimensional space defined by the 
recognition operations employed and 
also in terms of statistical decision 
theory, he concludes that the exten- 
sion of this system to recognition 
across the range of alphanumeric 
patterns easily recognized by human 
beings does not appear feasible. 

Wada et al. (1960) and Evey (1959) 
have made similar analyses for single 
cells in the matrix, and Baran and 
Estrin (1960) have implicitly made 
such an analysis in the “learning” 
that their program accumulates with 
experience. Highleyman and Ka- 
mentsky (1960) and Horwitz and 
Shelton (1961) have used similar 
methods for examining and correlat- 
ing individual cells. 


A relatively early program (Grean- 
ias, Hoppel, Kloomok, & Osborne, 
1957) divided the matrix into vertical 
strips, or columns, and made counts 
of filled cells within these columns, 
much as is done by some of the 
machines. It also made some assess- 
ment of relations between topmost 
points in different columns giving 
something weakly related to slope. 
Kamentsky and others at Bell Labs 
(Highleyman & Kamentsky, 1959; 
Kamentsky, 1959) have programed 
computers and built equipment to 
identify angles and closed loops, from 
which, presumably, a recognition 
logic could be built. These, now, give 
the beginnings of geometric and topo- 
logical features; but they might be 
considered as templates translated 
across parts of the matrix. 

Kovasznay and Joseph (1955), 
Selfridge (1955), Dineen (1955), and 
Kirsch, Cahn, Ray, and Urban (1957) 
have experimented with primitive 
operations that a computer might 
use. Their primitives seem interest- 
ing, possibly more interesting as psy- 
chological simulations than most of 
the other work discussed so far. They 
seem quite similar to some of the 
more concrete and specific psycho- 


1947; Marshall & Talbott, 1942). 
The abilities of these primitive opera- 
tions when tested, and the very pos- 
sibility of testing them, should be 
exciting in their implications to the 
psychologist. These primitives are 
such things as “edging” (turning 
areas into contours), averaging, find- 
ing connectivities (delimiting figures 
as opposed to ground), and counting 
(computing area and number of ob- 
jects). Apparently, at least in some 
of these projects, unpublished work 
was continued in trying to build up 
a series of these operations to map 
inputs into their proper classes and 
thus perform the complete pattern 
recognition function but, apparently 
(judging chiefly from the lack of 
published results), with little success 
(Minsky, 1959). 
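
Two of the primitives named above, edging and counting, can be sketched in a few lines. This is an illustration under stated assumptions only: the input is a binary matrix, connectivity is taken over the four nearest neighbors, and the names edge and count_objects are hypothetical rather than drawn from the cited programs.

```python
NEIGHBORS = ((1, 0), (-1, 0), (0, 1), (0, -1))


def edge(grid):
    """'Edging': keep a filled cell only if it touches an empty cell or the border."""
    h, w = len(grid), len(grid[0])

    def is_empty(r, c):
        # Cells outside the matrix are treated as empty ground.
        return not (0 <= r < h and 0 <= c < w) or not grid[r][c]

    return [
        [1 if grid[r][c] and any(is_empty(r + dr, c + dc) for dr, dc in NEIGHBORS) else 0
         for c in range(w)]
        for r in range(h)
    ]


def count_objects(grid):
    """Return (number of 4-connected figures, total filled area)."""
    h, w = len(grid), len(grid[0])
    seen = set()
    objects = 0
    for r in range(h):
        for c in range(w):
            if grid[r][c] and (r, c) not in seen:
                objects += 1
                seen.add((r, c))
                stack = [(r, c)]
                while stack:  # flood fill over one connected figure
                    y, x = stack.pop()
                    for dy, dx in NEIGHBORS:
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < h and 0 <= nx < w and grid[ny][nx] and (ny, nx) not in seen:
                            seen.add((ny, nx))
                            stack.append((ny, nx))
    return objects, sum(map(sum, grid))


# A solid 3x3 block and a separate short vertical stroke.
SCENE = [
    [1, 1, 1, 0, 0],
    [1, 1, 1, 0, 1],
    [1, 1, 1, 0, 1],
]

print(edge(SCENE))
print(count_objects(SCENE))
```

Edging hollows out the block and leaves the thin stroke untouched, while counting reports two figures and their combined area.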

Licklider (1960) has briefly men- 
tioned a simple method devised by 
J. I. Elkind to recognize alphanu- 
meric characters by categorizing 
slopes into three types and counting 
the number of items in each with a 
surprising accuracy of 85%. 

Booth (1956) has described a 
method that makes use of standard 
scanning to count the number of 
intersections between the input pat- 
tern and the lines described by the 
scanner. Using what he terms the 
“principle of digital feedback,” the 
machines will then apply additional 
scanning patterns until any near ties 
are broken. This is the equivalent 
of following a serial tree (or neural 
net) of operations, applying each 
only as a function of results up to 
that point. 
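
The serial application of scans can be sketched as follows. The sketch is a simplification: it assumes exact agreement of crossing counts rather than the resolution of near ties, and the tiny 3x3 figures, the two scan patterns, and all function names are hypothetical.

```python
def crossings(line):
    """Number of separate filled runs met along one scan line."""
    count, previous = 0, 0
    for cell in line:
        if cell and not previous:
            count += 1
        previous = cell
    return count


def row_scan(grid):
    return [crossings(row) for row in grid]


def column_scan(grid):
    return [crossings(col) for col in zip(*grid)]


def recognize(grid, known, scans):
    """Apply scans one at a time, stopping as soon as one candidate remains."""
    candidates = dict(known)
    used = 0
    for scan in scans:
        used += 1
        observed = scan(grid)
        candidates = {name: ref for name, ref in candidates.items()
                      if scan(ref) == observed}
        if len(candidates) <= 1:
            break
    return sorted(candidates), used


KNOWN = {
    "O": [[1, 1, 1], [1, 0, 1], [1, 1, 1]],
    "C": [[1, 1, 1], [1, 0, 0], [1, 1, 1]],
    "U": [[1, 0, 1], [1, 0, 1], [1, 1, 1]],
    "L": [[1, 0, 0], [1, 0, 0], [1, 1, 1]],
}
SCANS = [column_scan, row_scan]

print(recognize(KNOWN["O"], KNOWN, SCANS))  # settled by the first scan
print(recognize(KNOWN["U"], KNOWN, SCANS))  # U and L tie on columns; rows break the tie
```

The second scan is applied only when the first leaves a tie, which is the sense in which each operation is applied as a function of results up to that point.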

Singer (1960, 1961) gets character- 
izations of curves by measuring dis- 
tance from the center of gravity along 
various radians. Delay lines of the 
sort that they suggest might operate 
in the retina or cortex, according to 
their model of neural conduction, and 
would convert these spatial geometric 
qualities into temporal patterns. 
This approach seems quite similar to 
Deutsch's (1956) earlier theory. Har- 
mon (1960) has built a machine 
that makes a dilating circular scan of 
a centered figure, discriminating be- 
tween simple patterns such as 
circles, squares, and hexagons. These 
conversions of spatial distance into 
time make use of mechanisms very 
similar to those hypothesized by 
Deutsch (1956) and Rashevsky 
(1948). Similar operators are used, 
along with additional methods of 
analysis, in some of the more sophisti- 
cated methods, such as Unger’s 
(1958, 1959), discussed below. 
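
A characterization of a figure by distances from its center of gravity can be sketched roughly as below. The sketch assumes a binary matrix, rounds the center of gravity to the nearest cell, and samples only eight fixed directions; these choices and the name radial_signature are illustrative assumptions, not a description of the cited machines.

```python
import math

# Eight directions (row step, column step), starting to the right
# and turning counterclockwise as the figure is displayed.
DIRECTIONS = [(0, 1), (-1, 1), (-1, 0), (-1, -1), (0, -1), (1, -1), (1, 0), (1, 1)]


def radial_signature(grid):
    """Distance from the center of gravity to the farthest filled cell in each direction."""
    cells = [(r, c) for r, row in enumerate(grid) for c, v in enumerate(row) if v]
    cy = round(sum(r for r, _ in cells) / len(cells))
    cx = round(sum(c for _, c in cells) / len(cells))
    h, w = len(grid), len(grid[0])
    signature = []
    for dy, dx in DIRECTIONS:
        farthest, k = 0, 1
        y, x = cy + dy, cx + dx
        while 0 <= y < h and 0 <= x < w:
            if grid[y][x]:
                farthest = k
            k += 1
            y, x = y + dy, x + dx
        # Diagonal steps are longer than axial steps by a factor of sqrt(2).
        signature.append(farthest * math.hypot(dy, dx))
    return signature


SQUARE = [[1 if 1 <= r <= 5 and 1 <= c <= 5 else 0 for c in range(7)] for r in range(7)]
DIAMOND = [[1 if abs(r - 3) + abs(c - 3) <= 3 else 0 for c in range(7)] for r in range(7)]

print(radial_signature(SQUARE))
print(radial_signature(DIAMOND))
```

Read off in order, each list is a sequence of values that a delay line could present as a temporal pattern: the square reaches farthest along its diagonals, the diamond along its axes.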


Several groups have begun to ex- 
plore nets of neuron-like elements as 
pattern recognizers. Loebner (1960) 
has developed a simple “optoelec- 
tronic” light amplifier device that 
simulates simple neural nets. This 
work has been based upon an analysis 
of the anatomy of the retina and of 
the operators (straight line and sharp 
curve detectors) that have been 
demonstrated by Lettvin, Maturana, 
McCulloch, and Pitts (1959) to be 
found in the visual tract of the frog. 
The basic operation of this net is 
differencing between vertical and 
horizontal pairs of elements. Ka- 
mentsky (1959) has experimented 
with a matrix of neuron-like elements 
that sums positive values across 
spatially related neighbors, and can 
also sum negative values from the ele- 
ment directly in front, which gives 
something of an "on" and "off" 
effect. Change of threshold levels in 
both these methods leads to differing 
sensitivity to different characteris- 
tics of patterns. These methods and 
results are quite similar to some of the 
earlier simulation work done by 
Selfridge (1955) and Dineen (1955) 
and by Kirsch et al. (1957). W. A. 
van Bergeijk and Harmon (1960) 
have built a small net that is capable 
of identifying curved as opposed to 
straight lines under certain condi- 
tions. Uhr and Vossler (1961c) have 
proposed the use of several layers of 
on and off elements that would com- 
pose into the equivalent of “ex- 
clusive or” differencing and arith- 
metic sum averaging nets. Much of 
this work is related to Rashevsky's 
development of gestalt-perceiving 
nets. 

Several other groups (e.g., Bab- 
cock, 1960, 1961; Crichton & Holland, 
1959; Farley & Clark, 1960; Minsky 
& Selfridge, 1960; von Foerster, 1960; 
Willis, 1960) are exploring the proper- 
ties of neural nets with occasional 
application to pattern recognition 
problems. Frankel (1955) has made 
interesting suggestions for simula- 
tions. 

Sprick and Ganzhorn (1960) have 
used an analog curve follower that 
operates on the outside contours of 
forms to identify numerals, with 
considerable success across degrada- 
tions and distortions in the figure. 
Joseph and Feller (1958) have ex- 
plored the use of an analog curve 
follower that generates sets of num- 
bers that are functions of harmonics 
of the closed, smooth curves it is 
following. This method was shown 
to give different outputs for different 
contour shapes, such as B-58 jets 
with and without engines. Maxwell 
(1960) has briefly reported a curve 
following technique under develop- 
ment that recognizes motion in 
horizontal and vertical directions. 


Sophisticated Analytic Methods 


Programs that make use of sets of 
powerful individual operators. The 
first group of programs that can be 
taken seriously as working models of 
perception is the one that is typically 
called "analytical." It is probably 
best exemplified by the recent work 
at the Massachusetts Institute of 
Technology under the direction of 
Selfridge and Doyle (Doyle, 1960; 
Selfridge & Neisser, 1960), and by the 
work of Unger (1958, 1959) and of 
Bomba (1959) at Bell Labs. Shepard 
(1959) has proposed a program of this 
sort that makes use of learning 
principles as developed in psychology. 
In all of these programs, a large num- 
ber of separate analytic operations 
are performed on the input—opera- 
tions of the sort “is there a left con- 
cavity?" “is there an upper hori- 
zontal?” “how many crossings does 
the pattern make with various lines?" 
"is there a closed loop?" These, in 


fact, remain much the same as the 
operations previously discussed. 
They differ chiefly in two respects: 
relatively more powerful operators— 
in the sense that a curve detector is 
more powerful than a dot detector — 
are used; and these operators are 
satisfied not only by a rigidly placed 
input, but also by inputs over certain 
displacements, such as any curve 
anywhere within a specified section of 
the matrix. But this again is identical 
to a subpart template that can be 
wobbled or translated across at least 
some portion of the input and also 
stretched a bit. Thus it remains 
similar, though with certain impor- 
tant differences, to the translation 
and rotation of a rigid template, from 
which perfect match is not demanded, 
across restricted portions of the 
matrix. 

The use of templates for parts of 
figures even has advantages over 
these schemes in that information 
about relative positions between tem- 
plates can be retained. To this ex- 
tent, these analytic methods are 
discarding what would seem to be 
important topological and geometri- 
cal information, in that they are us- 
ing what still turn out to be relatively 
local and disconnected operations. 

Selfridge and Neisser (1960) and 
Rosenblatt (1960a) have emphasized 
the virtues of parallel processing, 
such as is done by the Pandemonium 
and the Perceptron, as contrasted 
with serial decisions such as proposed 
by Booth (1956) and programed by 
Unger (1958, 1959). Selfridge (1959) 
has pointed out that a serial method 
is unrealistic in that it necessitates 
greater computational depth, and, in 
ways, is difficult to program or realize 
in machines. On the other hand, to 
the extent that it can be programed 
efficiently, it is less redundant. Nor 
is it necessarily unrealistic, for there 
is good evidence that human identi- 
fication of poorly perceived objects 
improves over time and that some 
sort of sequential decision processing 
may well be taking place. In any 
case, the differences between parallel 
and serial application of operators do 
not seem to be fundamental concep- 
tual differences, but rather raise 
practical questions of efficiency, reli- 
ability, and feasibility of exactly the 
same sort that confront the experi- 
menter who must decide whether to 
collect his data first and then decide, 
or to collect each piece on the basis of 
information gained to that point; 
whether to ask 20 questions before 
getting any answers, or to ask them 
in the usual sequential manner. The 
sequential process is bound to do 
better; but very often (as, almost 
certainly, in the case of present-day 
computer simulations) it is so ineffi- 
cient in application as to be more 
cumbersome in the practical situa- 
tion. And the parallel method can 
also be rationalized as, in its redun- 
dancy, serving useful error correcting 
functions. 

Uhr and Vossler (1961a) have pro- 
gramed a computer to generate its 
own operators within a small "opera- 
tor matrix" by both generating 
random (connected or unconnected) 
n-tuples within the matrix, and ex- 
tracting random parts of an unknown 
input. The program then translates 
these operators across the large input 
matrix and accumulates information 
about their hits. It compares these 
operator output lists with lists pre- 
viously written in memory and chooses 
a list that gives the best result in 
satisfying a similarity criterion. It 
continually rewrites these memory 
lists, adjusting their values on the 
basis of each success or failure, ad- 
justing their weightings for their 
usefulness, and deciding to throw 


away worthless operators and to 
generate new ones. The translation 
method has a quite simple and plaus- 
ible neural analog, in a neural 
network of reiterated 5×5 nets that 
would perform essentially the same 
set of operations in parallel. Thus, 
instead of the on-off differencing net 
that Uhr and Vossler (1961c) have 
suggested as a single basic operator, 
or the single cell for which probabil- 
ities are computed in Uttley's (1956) 
computer, or the straight line and 
angle responders that Lettvin et al. 
(1959) have discovered and tenta- 
tively identified as demon-like opera- 
tors in the frog, each of these 
operators suggests a simple neural 
network that will respond to a rela- 
tively local geometric characteristic 
and, because it is repeatedly present 
in parallel throughout the retina, will 
respond to this no matter where it 
occurs spatially. Since the operator 
is itself a small (5×5) matrix, it in 
fact defines a small retinal matrix of 
similar design. This neural inter- 
pretation would seem to open pos- 
sibilities of making suggestions, from 
neurophysiology, as to what the 
most plausible operators, and types 
of operators, might be. 
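
The translation of a small operator across the input matrix can be sketched as follows. The sketch covers only operator extraction and the recording of hits; the memory lists, weightings, and discarding of operators described above are omitted. The representation of an operator (1 for a cell that must be filled, 0 for a cell that must be empty, None for a cell that is ignored) and all names are assumptions made for illustration.

```python
import random


def hits(grid, op):
    """Translate the operator across the matrix; return every offset where it is satisfied."""
    h, w = len(grid), len(grid[0])
    oh, ow = len(op), len(op[0])

    def satisfied(top, left):
        return all(want is None or grid[top + r][left + c] == want
                   for r, row in enumerate(op) for c, want in enumerate(row))

    return [(t, l) for t in range(h - oh + 1) for l in range(w - ow + 1) if satisfied(t, l)]


def extract_operator(grid, top, left, size, keep, rng):
    """Build an operator from `keep` randomly chosen cells of one part of an input."""
    cells = [(r, c) for r in range(size) for c in range(size)]
    chosen = set(rng.sample(cells, keep))
    return [[grid[top + r][left + c] if (r, c) in chosen else None
             for c in range(size)] for r in range(size)]


FIGURE = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 0, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
SHIFTED = [[0] + row[:-1] for row in FIGURE]  # same figure, one column to the right

CORNER = [[1, 1],
          [1, 0]]  # responds to an upper-left corner

print(hits(FIGURE, CORNER), hits(SHIFTED, CORNER))

generated = extract_operator(FIGURE, 1, 1, 3, 4, random.Random(0))
print((1, 1) in hits(FIGURE, generated))
```

Because the operator is tried at every offset, it responds to the corner wherever the figure lies, which is the property attributed above to an operator repeated in parallel throughout the retina.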

The Pandemonium program and 
similar programs, to the extent that 
they employ parallel operators that 
range across subparts of the input 
matrix, are doing something rather 
similar. But it seems more plausible 
physiologically to have an operator 
uniformly distributed across the ma- 
trix (which, when simulated on 
a sequentially operating machine, 
becomes uniform translation across 
the matrix) than to specify that it be 
distributed in only certain portions 
of the matrix. In general, however, 
most of these analytic operators have 
neural net or functional interpreta- 
tions. In our present state of igno- 


logical and psychological 
ments, as in the striking instance of 
Selfridge's operators directing Lett- 
vin, Maturana, McCulloch, and 
Pitts’ (1959) successful search for 
straight line and angle operators in 
the frog. Thus these programs may 
be on the verge of becoming 
important theoretical tools. 
It would be unfortunate if psycholo- 


interrelated pattern. The most power- 
ful of the analytic programs retain 
what would seem to be the greatest 
amount of pertinent information by 
analyzing connectivity and topo- 
logical features of patterns. These 


the topological 
features of figures. The success of 
this program without the addition of 
geometric operators is known to be 
limited, for, clearly, topology is often 


which operators that identify neces- 
sary geometric characteristics can be 
built. Minsky (1959) has discussed 
programs of this sort and Hodes 
(1961) has written a program to 
identify embedded patterns. Grims- 
dale, Sumner, Tunis, and Kilburn 
(1959) have written and tested a 
program that follows a connected 
pattern and extracts and orders this 
sort of topological and geometric in- 
formation with surprising success. 
This program would seem to offer 


promise across wide distortions. 
Uhr (1959) has suggested a similar 
approach, with certain simplifica- 
tions based on physiological and 
psychological evidence as to the hu- 
man's mechanism, de- 
signed to increase the power of the 
geometric operators. Kazmierczak 
(1960) has shown how a great deal of 
this information can be extracted by 
the simple analog method of a resistor 
network. Transformers embedded in 
this network will now be sensitive to 
straight lines and curves and to their 
orientations. A great deal of averag- 
ing and smoothing is also accom- 
plished. 


For all of these the problem re- 
mains that of mapping the many 
variants of figures into the relatively 
few appropriate classes. They do not 
seem to need much precision in 
analyzing detail. For clearly, the 
retention of all, or even too much, 
geometric detail will lead to over- 
— false differentiations. And, at 
t 
this mapping procedure remains an 
intuitive procedure (of either pro- 
gram or programer) of choosing 
operators sensitive to the important 
information and the invariant infor- 
mation. 

This approach treats the figure like 
a complicated curve and tries to de- 
scribe its important characteristics in 
a relatively simple, and manageable, 
manner. It strives for a logical 
analysis of the meaningful, informa- 
tion bearing, aspects of the pattern. 
It would seem to come the closest of 
any of the methods that discretize the 
matrix to making what we might 
term a gestalt analysis of the pattern. 
Nor is it clear what more we 
might desire from a gestalt analysis. 
For, in characterizing single elements, 
this program must do things like 
identify connectivities, compute local 
curvatures and average curvatures, 
and locate points of discontinuity 
(angles). In describing and then 
assessing relations between these 
relatively local elements it once again 
places these elements into their 
larger context, this time the total 
pattern. This method also seems to 
come the closest to giving a decom- 
position of the input pattern of the 
sort that is "natural" in the sense 
that people decompose and describe 
patterns in much this way—in terms 
of their slopes and curves, their 
topological features, and the separate 
simpler patterns into which they 
break as a result of discontinuities 
(angles and sudden changes in curva- 
ture). These, then, would 
seem to embody a functional theory 
of form perception that is supported 
by a good bit of psychophysical data 
as to discriminability of simple 
shapes, as well as by introspection. 
Since the basic underlying operation 
is differencing (as in difference equa- 
tions to compute slope and curva- 
ture), it would seem to be interpret- 
able in on-off neural nets (Rashevsky, 
1948). 

A similar approach is being taken 
in current proposals for programs to 
recognize handwriting (Eden & Halle, 
1960; Harmon & Frishkopf, 1960). 
These programs are being written to 
analyze and decompose the continu- 
ous curve of handwriting into a con- 
catenated set of basic elements from 
a small phonemic-like alphabet of 
such things as long and short, right- 
curved and left-curved, strokes. 
Bledsoe and Browning (1959) tested 
their program (to be discussed below) 
on handwriting inputs, with a moder- 
ate amount of success, ranging be- 
tween 30% and 50% under various 
conditions. They also introduced a 
method for using simple contextual 
information (a word list stored in 
MATHEMATICAL ANALYSES OF 
PATTERN RECOGNITION 


A different approach has been 
taken by a number of people who 


choosing, or discovering, the 
operators for the appropriate many- 
one mappings. Rather, they usually 
address themselves to the question: 
given a specified set of operations, 
what are the best methods for accom- 
plishing them or making use of their 
results? Or they suggest mathe- 
matical methods that have been 
thoroughly developed and hence 
might be powerful tools, if appropri- 
ate. 

For example, Gilmore (1958) has 
discussed the use of Fourier analyses, 
Goodall (1960) and Greene (1960) 
have suggested quantum mechanical 
models, and Novikoff (1960) has 
suggested integral geometry. Frankel 
(1960), Pugachev (1960), and Thun 
(1960) have discussed optimum cod- 
ing from the standpoint of statistics 
and information theory; Chow (1957) 
has applied decision theory, and Gill 
(1959) has presented methods for 
finding a shortest scanning path. 
Marill and Green (1960) have demon- 
strated how well a good statistical 
recognition function might work and 
how it can be implemented on a 
computer. Gabor (1959, 1960) has 
developed learning filters that he sug- 
gests would be useful for pattern 
recognition. The vast amount of 
work in information, coding, and 
filter theory becomes relevant here. 
Uttley (1956, 1959) has also made 
attempts to apply his very general, 
interesting, and powerful model for 
computing conditional probabilities 
of simple Boolean functions of the 
input matrix. Mattson (1959), ex- 
tending this in a certain direction, 
has programed a self-organizing sys- 
tem that will discover the proper 
Boolean function for linear partition- 
ing of an n-dimensional space into 
appropriate sets. Widrow and Hoff 
(1960) have simulated and analyzed 
networks of similar systems. Sebes- 
tyen (1961) has demonstrated the 
striking abilities, when applied to 
speech recognition, of a method for 
transforming the space within which 
the patterns lie. 


RANDOM NETS 


In many ways closely related to 
these more formal mathematical anal- 
yses of the problem is the statistical 
analysis of the random neural net by 
Rosenblatt and his co-workers (Hay, 
Martin, & Wightman, 1960; Joseph, 
1960; Murray, 1959; Rosenblatt, 
1958a, 1958b, 1958c, 1960a, 1960b). 
Rosenblatt has greatly extended ear- 
lier work done by Clark and Farley 
(1955) and Farley and Clark (1954) 
on the ability of randomly connected 
nets to become organized as a result 
of feedback modifications of firing 
thresholds consequent upon success 
or failure. Clark and Farley had 
demonstrated empirically that this 
method could succeed with a small 
set of neuron elements and simple 
differentiations. Rosenblatt has 
proved, or attempted to prove, a 
number of much more powerful 
theorems as to this sort of system’s 
performance in making a larger num- 
ber of differentiations over unlimited 
transformations on the inputs. These 
proofs are unusually difficult to follow 
and have raised a good deal of skep- 
ticism and criticism (see, e.g., discus- 
sions following some of Rosenblatt’s 
papers). Several mathematicians 
have attempted to restate this math- 
ematical development (e.g., Joseph, 
1960; Keller, 1961; Kesten, 1959). 
Their results seem to indicate that 
the completely randomly connected 
Perceptron will be sensitive to dif- 
ferences in area and to nothing else. 

It is also difficult to follow Rosen- 
blatt's informally stated claims. But 
it would seem that his writings give 
a strong impression that he expects 
the random Perceptron to identify 
differences in formal characteristics of 
inputs and even purports to have 
demonstrated this in simulations of 
simple cases. But a closer reading of 
his discussions and of the experiments 
performed leads this writer to the 
conclusion that he has not actually 
made these claims. Rather, he has 
demonstrated the moderate amount 
of success to be expected from a 
device of this sort, as indicated by 
Clark and Farley's work and by some 
of the empirical studies of the num- 
ber of matrix cells that are covered 
by all members of a class on the one 
hand, or by members of one class 
but not another on the other hand— 
studies that have led to the choices 
of crucial cells for identification, such 
as Bailey and Norrie's work (1957) 
and Baran and Estrin's (1960). But 
it seems likely that this success will 
diminish, not increase, as the number 
of classes to be identified is increased 
and the narrow limits imposed upon 
transformations of forms are relaxed. 
On the other hand, the theoretical 
development suggests that the powers 
of the Perceptron will greatly increase 
as size is increased and become en- 
tirely adequate to a complex per- 
ceptual task within reasonable size 
limitations. But this development 
demonstrates only the existence of a 
suitable Perceptron. Thus it may be 
valid only for some Perceptrons, those 
that are not completely random in 
organization, but rather have built-in 
important geometric constraints— 
chiefly a gradient of nearness. These 
constraints immediately seem to con- 
vert the Perceptron into quite a dif- 
ferent sort of machine; not at all a 
random net that organizes itself, but 
simply a spatially organized input 
matrix with local randomness. 

This organization does not seem to 
the present writer to be at all dis- 
advantageous. Rather, it seems es- 
sential, and eminently reasonable, 


since the most important character- 
istics of the inputs to be recognized 
are their formal characteristics. The 
random Perceptron immediately elim- 
inates these, destroying all informa- 
tion about local connectivity, slope, 
and curvature. Since many of the 
statements that can reasonably be 
expected of a machine that can 
recognize forms are about just these 
characteristics (e.g.: "Is it a broken 
or smooth figure?" "Does it have 
sharp or curved angles?" "Is it made 
of straight or wavy lines?" "De- 
scribe what is next to the specified 
point."), it would seem that this 
machine is being posed an enormous 
problem of simply getting back to the 
point where it started. The incor- 
poration of a spatial gradient does, 
however, make the Perceptron a 
much less mystical and extraordinary 
machine. Although the Perceptron 
deliberately throws away almost all 
information about local connectivi- 
ties, it retains a certain amount of 
this information as a result of the 
convergence of randomly coupled 
sensory cells into a single association 
unit. This would seem to give the 
equivalent of a random sampling of 
weak interrelations, much like the 
n-tuple method that we will discuss 
below. 

Rosenblatt's Perceptron seems very 
similar to the hardware developed by 
Taylor (1959a, 1959b, 1959c; Barlow, 
1961). Taylor's machine has demon- 
strated some ability to discriminate 
between a few different figures, over 
a small number of transformations, 
with a very small and coarse input 
grid. Roberts (1960) has simulated 
a Perceptron-like net with random 
connections with fair success across 
six letters. He introduced constraints 
as to nearness when he attempted to 
process the 26-letter alphabet. After 
40 learning experiences, his program 
achieved 94% success on Sherman's 
input forms. The program in this 
version seems similar to 1-tuple match 
programs such as Baran and Estrin's 
(1960), to be discussed below, and 
shows comparable results. It does, 
however, allow for a certain amount 
of identification of local geometric 
characteristics, and may be capable 
of more flexibility as a result. 


N-TUPLE MATCHING: SOME RELA- 
TIONS BETWEEN DIFFERENT SIMU- 
LATIONS AND MODELS 

1-tuples 

A number of groups have made use 
of a very simple, almost prototypical, 
scheme for recognition: the examina- 
tion of individual spots. The basic 
method was probably first suggested 
for computers by Uttley (1954, 1956, 
1959) in his “conditional probability 
computer.” This is a device that 
keeps a running count of association 
between the states of each element 
and combination of elements in the 
matrix (or, this could be thought of as 
each operator on the input) and the 
proper name of each input. This is, 
very simply, a computer that will 
examine, when it also looks at com- 
binations of single elements, all pos- 
sible relations between input and out- 
put. Because of this major defect, 
so that it can never hope to encom- 
pass so large a space, it is usable only 
when it computes probabilities for 
only the cells in the matrix taken 
individually rather than in combina- 
tions. It is, in a sense, an empty sug- 
gestion, in that it is merely an at- 
tempt at embodying the principle “do 
everything possible.” As we will see 
in a moment, as a result, all methods 
can be looked at in the context of this 
method. But we will defer this discus- 
sion until we come to one of the other 
methods, which has examined pairs 
and higher-ordered combinations of 
the individual cells (randomly rather 


than exhaustively chosen, so that the 
space of possibilities does not become 
overwhelmingly large), which is, obvi- 
ously, a member of this family. 

Uttley’s method is, simply, of a 
classic associationist sort. The anal- 
yses and results that have been ob- 
tained with it would seem to be quite 
interesting in illuminating the prob- 
lems (of size and search space) to 
which this method leads. 

Several of the programs we have 
discussed so far are obviously of the 
same type. Logics such as those un- 
derlying the explorations into best 
sets of single cells are attempts to 
choose crucial cells and to eliminate 
those cells that give no useful differ- 
entiation. But note that this is not 
doing anything more than the built-in 
process of the conditional probability 
computer would do. For as it gained 
experience it would be accumulating 
sets of probabilities for some of its 
cells that were essentially equal as to 
the different alternatives, or were 
redundant to the sets for certain other 
cells. These sets, then, would not, in 
the final computations, be contribut- 
ing anything to the choice. The 
essential difference is that the spe- 
cially chosen subsets are got if the 
designer divides his problem into two 
phases, first getting experience on the 
basis of which he chooses special cells 
for his machine, then building the 
special machine that he hopes will 
work on the sorts of things with which 
he has had experience. The machine 
that is built to accumulate its own 
experience cannot take advantage of 
the streamlinings in design that elimi- 
nate the inessential parts; on the 
other hand, it is capable of continu- 
ally modifying its probabilities and 
thus learning about and adapting to 
new inputs. And, in fact, the more 
general type of learning machine 
would be capable (though this has not 
been done as yet in published re- 
search) of winnowing out its dead- 
wood by means of checks on discrimi- 
nating ability of individual cells and 
of redundancy between cells. 

Uttley's own work has not been 
able to handle realistic inputs because 
of the great size limitations imposed 
upon his computers by their being 
analog nets of switches and probabil- 
ity accumulators. Several attempts 
to use the simple conditional proba- 
bility method with single cells alone 
with the larger sets allowed by a digi- 
tal computer simulation have given 
surprisingly good practical results. 
Thus Highleyman and Kamentsky 
(1960) have reported 77% success in 
using a 12×12 matrix to process a 
wide range of distortions over 50 
alphabets (and correspondingly bet- 
ter results, over 99% correct, with 
the smaller distortions of the machine 
printing typically used in demonstra- 
tion studies). Baran and Estrin 
(1960) have used essentially the same 
scheme with comparable results as 
tested on machine characters. 
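
The single-cell conditional probability scheme can be sketched as below. This is a minimal illustration and not the decision procedure of any of the cited programs: it assumes binary cells, adds one to each count so that no probability is exactly zero or one (an assumption of this sketch), and combines cells by summing log probabilities as though they were independent. All names are hypothetical.

```python
import math


def train(examples):
    """examples: list of (name, grid). Returns name -> list of P(cell filled | name)."""
    counts, totals = {}, {}
    for name, grid in examples:
        flat = [cell for row in grid for cell in row]
        acc = counts.setdefault(name, [0] * len(flat))
        for i, cell in enumerate(flat):
            acc[i] += cell
        totals[name] = totals.get(name, 0) + 1
    # Add-one smoothing (an assumption of this sketch) avoids log(0) later.
    return {name: [(k + 1) / (totals[name] + 2) for k in acc]
            for name, acc in counts.items()}


def classify(model, grid):
    """Choose the name under which the observed cells are jointly most probable."""
    flat = [cell for row in grid for cell in row]

    def score(probs):
        return sum(math.log(p if cell else 1 - p) for p, cell in zip(probs, flat))

    return max(model, key=lambda name: score(model[name]))


BAR_V = [[0, 1, 0], [0, 1, 0], [0, 1, 0]]
BAR_H = [[0, 0, 0], [1, 1, 1], [0, 0, 0]]
NOISY_V = [[1, 1, 0], [0, 1, 0], [0, 1, 0]]  # vertical bar with one spurious spot

model = train([("V", BAR_V), ("H", BAR_H)])
print(classify(model, NOISY_V))
```

Cells whose probabilities are nearly equal under every name add almost the same amount to every score, and so contribute nothing to the choice.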

The random methods, such as 
Taylor's (1959a) and Rosenblatt's 
(1958a), are also quite similar to 
Uttley’s basic method. For their 
basic operation is to accumulate some- 
thing like probabilities on individual 
cells. They sometimes evaluate these 
probabilities in different ways—as in 
Rosenblatt's Perceptron, where a 
randomly chosen set of these cells is 
used to make a decision. Then, with- 
in these randomly chosen sets, cells' 
probabilities are tuned down or up 
and carry weight to the extent that 
they have been shown in the past to 
add information. 


2-tuples 

Bledsoe and Browning (1959) work 
with a 10×15 matrix (totaling 150 
cells) for their input stage. From this 


matrix the computer draws random 
n-tuples. In most of their published 


experiments these have been 2-tuples 
without replacement; but some of the 
most suggestive results, to which we 
will return in a moment, have been 
comparisons of n-tuples of differing 
sizes. There will thus be 75 2-tuples 
extracted from the 150 cells. Each 2- 
tuple can be in one of four states as a 
result of a stimulus symbol: either 
both cells are excited, Cell i is excited 
but Cell j is not, Cell j is excited but 
Cell i is not, neither cell is excited. 
Each symbol will, therefore, throw 
each 2-tuple into one of its four states. 
The computer program characterizes 
a symbol that it is asked to recognize 
by 75 statements, the one for each 2- 
tuple that tells which of the four states 
that 2-tuple is in. This set of state- 
ments is then matched up with the 
sets of similar statements, one for each 
symbol that the computer "knows" 
in its memory. The best match deter- 
mines the name it gives the input 
symbol. 
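The procedure just described can be sketched in a few lines. This is only an illustration of the description given here, not Bledsoe and Browning's actual program: the flat 0/1 image encoding, the function names, the fixed random seed, and the two toy symbols are all assumptions made for the example, and each symbol is stored from a single example.

```python
import random

ROWS, COLS = 15, 10            # assumed layout of the 10 x 15 matrix
M = ROWS * COLS                # 150 cells

def draw_tuples(n, seed=0):
    """Split the M cells into M // n random n-tuples, drawn without replacement."""
    cells = list(range(M))
    random.Random(seed).shuffle(cells)
    return [tuple(cells[i:i + n]) for i in range(0, M - M % n, n)]

def states(image, tuples):
    """One statement per n-tuple: the excited/unexcited state of each of its cells."""
    return [tuple(image[c] for c in t) for t in tuples]

def recognize(image, memory, tuples):
    """Return the name of the stored symbol whose statements best match the input."""
    seen = states(image, tuples)
    def agreement(name):
        return sum(a == b for a, b in zip(seen, memory[name]))
    return max(memory, key=agreement)

# Toy symbols (hypothetical): a vertical bar and a horizontal dash.
tuples = draw_tuples(2)
bar = [1 if i % COLS == 3 else 0 for i in range(M)]
dash = [1 if i // COLS == 7 else 0 for i in range(M)]
memory = {"bar": states(bar, tuples), "dash": states(dash, tuples)}

noisy = bar[:]
noisy[0] ^= 1                  # flip one cell to distort the input
print(recognize(noisy, memory, tuples))
```

With n = 2 each statement takes one of the four states listed above, and changing the argument of `draw_tuples` gives the larger n-tuples discussed below.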

This method is equivalent to a 
crazy sort of template matching, in 
which two randomly conjoined spots 
define a template. Recognition is 
now effected not by a single different 
template for each item, with as many 
templates as there are items, but by 
M/n (in this case, 150/2=75) tem- 
plates for each item, but only M/n 
templates no matter how many items. 
Thus a different subset of these tem- 
plates will be the critical templates 
for each item and to some extent a 
different subset for each item in rela- 
tion to each other item. A set of 150 
1-tuples will, in certain respects, be 
more similar to a template with each 
1-tuple equivalent to the basic spot 
in the template and the total of the 
excited 1-tuples the equivalent of the 
integration of the match between
template and stimulus. 

The 2-tuple experiments gave 
rather good recognition. The 1-tuple 
experiments (150 1-tuples of two pos-
sible states) gave slightly poorer suc-
cess. The larger n-tuple experiments
gave greatest success. Highleyman 
and Kamentsky (1960) have shown, 
however, that this method does more 
poorly with a larger variety of inputs, 
so that it was inferior, on the same 
data, to their 1-tuple procedure re- 
ported above. But Uhr (1961) has 
pointed out that, everything else held 
constant, the 2-tuple method is 
bound to be at least as good as the 1- 
tuple method, so that with the im- 
proved decision making procedure of 
the 1-tuple method with which it was 
compared, it should again give the 
greatest success. 

The absolute meaning of these suc- 
cess estimates, typically given in per- 
centage correct, is almost as obscure 
as the meaning of the percentage cure 
figures given by different schools of 
psychotherapists, since no two differ- 
ent experimenters have used stimuli 
varied in any comparable way across 
their various dimensions. But a rough 
impression is that the results for 2- 
tuples may be as good as those 
achieved by the best alternate 
schemes. This seems interesting be- 
cause of the simplicity of the scheme, 
the almost self-organizing way in 
which the set of recognizing n-tuples 
is randomly generated, and the way 
in which their outputs are used, 
originally, to write the memory lists. 

There are several glaring faults 
with this scheme as a model for per- 
ception, but these seem to be easily 
corrected (although the physiological 
analog does not seem apparent).
First, there will certainly be n-tuples 
that are not much good and n-tuples 
that are completely worthless—that 
is, n-tuples that are thrown into the 
same state by all stimuli or are always 
thrown into the same state (hence 
completely redundant to) as some 
other n-tuple. This could easily be
eliminated by a subroutine that
evaluated n-tuples, according to these 
or whatever other pertinent criteria 
the programer wished. More gen- 
erally, this is a question of the degree 
to which a specific n-tuple reduces the 
uncertainty in making a choice be- 
tween the set of symbols to be recog- 
nized, given the set of sets of n-tuple 
states excited by the set of stimuli to 
be recognized. Roberts (1960), in one 
of his attempts to strengthen the Per- 
ceptron, and Uhr and Vossler (1961a) 
in their method for replacing neural 
net operators, have done something 
similar. Clearly, this sort of sub- 
routine would improve performance 
and make the program more efficient; 
in some senses there would be “learn-
ing.” It might also be used as a hill
climbing method to choose a best set 
of n-tuples. 
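A minimal sketch of such an evaluating subroutine follows, covering only the two criteria named above: n-tuples thrown into the same state by every stimulus, and n-tuples always thrown into the same state as an earlier one. The function name, the flat 0/1 image encoding, and the toy stimuli are assumptions for illustration; an uncertainty-reduction criterion would need class labels and is not attempted here.

```python
def prune_tuples(tuples, stimuli):
    """Drop n-tuples that are constant over all stimuli or duplicate an earlier n-tuple."""
    kept, seen_profiles = [], set()
    for t in tuples:
        # Profile: the state this n-tuple takes under each stimulus, in order.
        profile = tuple(tuple(img[c] for c in t) for img in stimuli)
        if len(set(profile)) == 1:      # same state for all stimuli: worthless
            continue
        if profile in seen_profiles:    # always agrees with an earlier n-tuple: redundant
            continue
        seen_profiles.add(profile)
        kept.append(t)
    return kept

# Two toy stimuli over eight cells (hypothetical data).
s1 = [1, 0, 1, 0, 1, 1, 1, 0]
s2 = [1, 0, 0, 1, 0, 0, 0, 1]
pairs = [(0, 1), (2, 3), (4, 5), (6, 7)]
useful = prune_tuples(pairs, [s1, s2])
print(useful)
```

Here the pair (0, 1) is dropped because both stimuli leave it in the same state, and (6, 7) is dropped because it always agrees with (2, 3).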

In one respect, the experiment in 
which the size of an n-tuple was varied 
has done just this. Given 150 items, 
they work more powerfully taken five
at a time than two at a time. But 
which of the M/n n-tuples could be 
eliminated with the least loss in 
power; and would this loss in power 
be appreciable? How many could be 
eliminated? Of the M(M-1)/2 pos-
sible 2-tuples, rather than of the M/2
that happened to have been chosen in 
this particular experiment, which is 
the best set of 2-tuples? 
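To give a sense of the size of this search, the counts for the 150-cell matrix can be worked out directly. The variable names below are illustrative only.

```python
from math import comb

M = 150                          # cells in the 10 x 15 matrix
possible_pairs = comb(M, 2)      # M(M-1)/2 distinct 2-tuples
drawn_pairs = M // 2             # 2-tuples actually drawn without replacement

print(possible_pairs, drawn_pairs)
print(f"{100 * drawn_pairs / possible_pairs:.2f}% of the possible 2-tuples are sampled")
```

A single random draw thus examines well under one percent of the possible 2-tuples.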

No plausible neural model could 
comfortably contain conjunctions of 
randomly drawn neurons or cones. 
But the simple constraint that seems 
obviously suggested by the neural 
model—that the cell be connected to 
the other members of its n-tuple— 
leads to a number of interesting con-
sequences. Consider the 5-tuple. In
any column of the 10X15 matrix, 
there will be 11 possible connected 5- 
tuples that define a straight vertical 
line 5 cells long. In all 10 columns of
the matrix, there will thus be 110 of
this sort of 5-tuple. One subset of 
these connected n-tuples, already ap- 
parent with 4-tuples or 5-tuples, but 
true to any degree of precision as n in-
creases, will be the set of sets that
define Bomba's and Unger's elements, 
many of Selfridge's demons, and
similar basic operators. That is, a 
demon-like "curve on left open to- 
ward the bottom" is equivalent to the 
set of n-tuples that satisfy this defini- 
tion, which in turn is equivalent to a 
smaller set of n-tuples plus rules for 
their transformation or iteration in a 
neural net. The problem of choosing 
the best n-tuples is now equivalent to 
Selfridge’s problem of choosing the 
best demons. But it can now be done 
by an experimenting learning com- 
puter. And, clearly, it is possible to 
use Bledsoe's methods of random 
generation to create demons from as 
close to nothing as one could appear 
to get. 
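The count of straight vertical 5-tuples given above can be checked by enumeration. The sketch assumes the matrix has 10 columns of 15 cells each, as the figures of 11 per column and 110 in all imply; the function name and the (row, column) cell encoding are illustrative.

```python
ROWS, COLS = 15, 10   # 10 columns of 15 cells each

def vertical_tuples(n):
    """All n-tuples of (row, col) cells that form a straight vertical run n cells long."""
    return [tuple((r + k, c) for k in range(n))
            for c in range(COLS)
            for r in range(ROWS - n + 1)]

fives = vertical_tuples(5)
per_column = sum(1 for t in fives if t[0][1] == 0)
print(per_column, len(fives))
```

The same enumeration, extended to horizontal, diagonal, or curved runs, would generate the other connected n-tuples discussed here.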


NONVISUAL PATTERN PERCEPTION


The patterns of visual form percep- 
tion are only a small set of the pat- 
terns that have been examined for 
computer analysis, although by far 
the greatest amount of work has been 
done on them. People have for a long 
time discussed the desirability of gen- 
eral theories and methods for pattern 
analysis and recognition. Curve fol- 
lowing, filtering, searching, hill climb- 
ing, and optimizing are all examples 
of terms denoting mathematical ap- 
proaches of fair generality to pattern 
recognition problems. Much of the 
power of the geometry, logic, and 
game playing computer programs lies 
in their abilities to recognize and work 
with patterns—in assessing such 
things as "similarities" between posi- 
tions and moves, or "distances" and 
directions for traversing these dis-
tances between desired results and
present position (see Gelernter, 1960;
Minsky, 1961; Newell, Shaw, &
Simon, 1960; Samuel, 1959; Wang, 
1960a, 1960b). Any problem could be 
formulated in terms of the position 
from which the problem begins (the 
question, or the present state of 
knowledge) and the desired end posi-
tion (the answer, the theorem).
Whatever information is avail-
able about the question, about the
possible directions toward an answer, 
and about the form of the answer, 
begins to sketch out a pattern. The 
problem, for mathematician, scien- 
tist, artist, perceiver, or game player 
is now one of filling in this sketch in a 
way that will eventually lead to filling 
in the path from question to answer. 
Put another way, the problem is one 
of mapping an input into a proper out-
put by means of operators appropriate
to the input and the mapping prob- 
lem. But the patterns of the input 
and the problem (as "sensed" and 
“understood” by the computer) them- 
selves determine the universe of pos- 
sible and appropriate operators and 
output classes. 

The various sensory input modali- 
ties are quite obviously specific ex- 
amples, dependent upon specific 
transducers, of the problem of recog- 
nition of patterns presented by the 
physical world. Probably speech rec- 
ognition by machines is the most 
similar of the other pattern recogni- 
tion problems. Several groups of 
researchers have done quite extensive 
work on this problem, but their results 
and present recognition techniques 
are still relatively weak and primitive. 
Suitable transducing of the complex 
acoustical speech waves to a form suit-
able for machine analysis is difficult,
and extremely elaborate equipment 
development problems have kept 
workers from seriously attacking the 
more important and interesting rec-
ognition problems. A basic method
for inputting voice signals has been 
to use narrow band-pass filters, up to 
40 on the frequency spectrum, and to 
sample these filters at short intervals, 
around 10 per second, although some 
devices use far less elaborate filtering 
techniques. Thus speech is put into a 
discrete matrix of the sort used for 
light-dark figures. But speech is not 
at all the same kind of thing. Inter- 
actions between greatly differing fre- 
quencies may be of far greater impor- 
tance; the third dimension of inten- 
sity, and more likely relative inten- 
sity, probably contains important 
information; and the underlying gen- 
erator of the speech waves—the hu- 
man's voice mechanism—may deter- 
mine, and be the key to, the important 
patterns. At any rate, work on the 
discretized input has hardly begun, 
since the most sophisticated systems 
today (such as those developed by 
David, Mathews, & McDonald, 1958; 
Denes, 1959; Dersch, 1960; Dudley & 
Balashek, 1958; Forgie & Forgie,
1959; Hughes & Halle, 1960; Petrick 
& Willett, 1960; Stevens, 1960) have, 
as their most powerful operators, first 
difference assessors—the equivalent 
of the very weakest analytic methods 
of the visual pattern recognition pro- 
grams. Nor has any work at all been 
done on programing speech recogni- 
tion machines to choose operators or 
to learn. This seems unfortunate, be-
cause the pattern recognition prob-
lem of operating on the discretized 
input is very similar to the problem 
for visual inputs and a lot of experi- 
ence with the latter might be immedi- 
ately applicable. And it seems likely 
that examination of the recognition 
problems would greatly clarify the 
transduction problem, since it is not 
at all clear at present what elements 
of the speech wave contain the rele-
vant information. Sebestyen (1961)
has, however, demonstrated surpris-
ingly good results (on spoken num- 
bers only) using his transforma- 
tion method on extremely coarsely 
sampled speech. 

One very powerful concept of 
speech recognition, that has already 
been applied with great promise to 
visual handwriting recognition (Eden 
& Halle, 1960), is the linguist's unit, 
the phoneme. The phoneme suggests 
a basic alphabet of invariant units for
decomposition and description, and 
Eden and Halle have suggested a 
very simple basic alphabet of curves 
and loops into which curves of the 
sort we typically find in handwriting 
can be decomposed. These, in fact, 
are very similar to the elements of the 
powerful analytic programs. 

Several programs have been written 
to recognize patterns of Morse code 
(Blair, 1959; Gold, 1959). These have 
attacked a relatively simple problem, 
and have run into surprising diffi- 
culties in analyzing the relations 
between lengths of different dots 
and dashes, wherein a great deal of 
information, with human opera-
tors, lies. Still other patterns, such
as electroencephalographic tracings 
(Farley, Frishkopf, Clark, & Gil- 
more, 1957), bubble chamber patterns 
(Innes, 1960), star counts (Borgman, 
1959), bacterial counts (Mansberg, 
1957), and abnormal leucocytes (Pres- 
ton, 1961), are being processed with 
machine pattern recognition meth-
ods.

Several programs have attacked 
very simple problems of recognition 
of patterns of more obviously mathe- 
matical sorts. Hagelbarger (1956), 
and earlier Shannon (reported in 
Hagelbarger), built simple machines 
to discover sequences of ones and 
zeros. Kochen and Galanter (1958) 
ran experiments to see how well hu-
man beings function as computers of
this sort and proposed a program that
would simulate the results of their 
experiments. Kochen (1960, 1961) 
has since reported results of programs 
with abilities to learn such series and 
has specifically applied this method 
to pattern recognition. Foulkes 
(1959) has programed several more 
sophisticated machines of this sort, 
and Watanabe (1960) has programed 
his inductive inference model to ex- 
tract similar patterns. Here, simple 
1-dimensional strings, sequences of 
one-bit numbers, are the patterns to 
be recognized. An interesting aspect 
of these endeavors is that the patterns 
can continue over time, in which case 
the machine must keep a running 
assessment over its entire history, 
extract special patterns from the con- 
tinuing string, and be sensitive enough 
to the short run event to be able to 
make adjustments to changes. These 
are things that need to be, and could 
be, built into other pattern recogni- 
tion machines; but this has not as yet 
been done. The very simplicity of 
this problem, with its one-dimensional 
analysis, may also make it a more 
easily manageable vehicle with which 
to develop basic methods than the 
higher-dimensioned pattern recogni- 
tion problems. Rochester, Holland, 
Haibt, and Duda (1956) have pro- 
gramed Hebbian cell assemblies and 
tried to develop them as pattern 
carriers with a small amount of suc- 
cess. This is an interesting test of a 
theory and a curious contrast of a
random pattern recognizer trying to 
do something extremely simple for a 
different kind of computer.

Linguistic analysis at its most
primitive level can also be thought of
as a one-dimensional pattern recogni-
tion problem in which the string of
words forming a sentence is a gram-
matical pattern. A good bit of ma-
chine computer translation of foreign
language treats words in obvious
primitive ways according to likely
patterns. More basic studies of gram-
matical patterns (Chomsky, 1957)
are beginning to treat the corpus of 
utterances at a more general level, in 
this way, attempting to generate 
sentences and to transform sentences 
by means of underlying rules about 
syntactical patterns. But at this 
level the problem is obviously a many 
dimensioned one and probably of 
even greater complexity than the 
problem of visual pattern recognition. 


EXPERIMENTAL EVALUATION OF 
PATTERN RECOGNITION PROGRAMS 


Problems and Results 


Most of the rather large number of
programs written for research pur-
poses into both pattern recognition
and human simulation have made use 
of much the same sorts of relatively 
weak analytic logics used in the com- 
mercially built machine that achieves 
an adequate level of success within a 
carefully chosen and limited domain. 
But the specific characteristics they 
analyze have usually been chosen 
with more general classes of char- 
acters, or more purely theoretical 
considerations, in mind. But it is 
hard to evaluate these different 
methods, for there is no good con- 
ceptualization of the problem—either 
as pattern recognition by machines, 
form perception by animals, or in- 
variance over admissible transforma- 
tions as analyzed mathematically. 
Nor have the few running programs 
that are relatively successful been 
evaluated in any sort of systematic, 
experimentally defensible way. 
Rather, each researcher has chosen 
an arbitrary set of test cases as in- 
puts, almost always with severe re- 
strictions as to distortions and with 
the weaknesses of his own program
clearly in mind, to demonstrate the
positive aspects of how well his 
methods can work. In most cases this 
has not been the researcher's fault. 
For there is no clear conceptualiza- 
tion of the problem even at the con- 
crete level of what the total array of 
patterns that a program should 
recognize might be, much less at the 
level of their underlying dimensions. 
Thus no one has even attempted to 
specify the ranges and distributions, 
in various dimensions, of inputs that 
might confront a program, and then 
assess where, in this complex space, 
his program might succeed and fail. 
Instead, people have almost always 
built up a small showcase of (often 
prearranged) successes, discussed 
some failings to be found in the pro- 
grams of other researchers but not 
theirs, and discussed failures that 
would occur with any but the deepest 
of perception computers. This seems 
unfortunate, because at least the 
beginnings of objective evaluations 
could be made. Programs could
easily be compared with one another 
by the simple expedient of testing 
them all on the same array of inputs. 
Inputs could be classified at least 
roughly along such dimensions as 
thickness, noisiness, regularity, uni- 
formity of size, translations, rotations, 
variety of typefaces, local variability, 
geometric variability, topological 
variability, completeness of figures, 
so that the limits of a program could 
be more clearly specified along these 
dimensions. Young (1960) has at- 
tempted to do something in this direc- 
tion by testing the limits of distor- 
tions within which people still iden- 
tify letters. 

A few worthwhile beginnings to- 
ward experimental tests have been 
made. Kirsch (1959) has made a plea 
for comparative tests, has suggested 
that input data be accumulated for
public use, and has offered researchers
in this field the use of the Bureau of 
Standards' input preparation equip- 
ment. Highleyman and Kamentsky 
(1959) have begun a program of com- 
paring alternate pattern recognition 
methods, and have already (1960) 
published results of a replication of 
Bledsoe and Browning's work. Sher- 
man's input patterns are being used 
by Selfridge and his co-workers in the 
testing of the Pandemonium program 
and by Roberts (1960) in testing his 
program, so it seems likely that there 
will be some basis for comparison 
between these three different pro- 
grams. 

A brief and necessarily unsyste- 
matic comparison between results 
reported for some of the different pat- 
tern recognition programs that we 
have discussed should be of some in- 
terest in illuminating the degree of 
unsystematization in this research 
enterprise as of today, the problems 
that have resulted, and, possibly, 
something about the relative powers 
of the different approaches. The
methods that will be compared are 
chosen (necessarily arbitrarily) be- 
cause they (a) are among the few that 
have been sufficiently tested so that 
results extensive enough to be com-
pared have been published, (b) repre-
sent interesting approaches, or (c)
have been relatively widely publi- 
cized. 

The most advanced commercial 
machines have been reported to rec- 
ognize a large variety of patterns—at 
least the 80-odd present on a type- 
writer keyboard (including both up- 
per and lower case letters)—over a 
variety of type fonts. It is hard to 
judge what this variety of fonts might 
be. It is almost certainly smaller than
the variety of handprinted letters as 
made by different human hands. The 
variety can, however, be made arbi-
trarily great by the standard ad hoc
method of this type of machine—the 
addition of a new, alternate logic or, 
as in the case of some machines, the 
use of a different set of template 
filters for each different type font. 
Thus, with relatively clean printing 
on clean paper, these machines give 
extremely high percentage correct 
rates (95-99.99%) for a large number 
of narrowly restricted patterns. Other 
programs of this sort (Sprick & 
Ganzhorn, 1960) are able to do well 
with degraded inputs (smudged let- 
ters), by means of stronger logic 
(curve assessments) and, as reported 
to date, over far fewer alternatives 
(the 10 numbers). 

It is not at all certain that the more 
sophisticated methods can do better 
than, or even as well as, these ma- 
chines. However, they are normally 
tested on handprinting which is far 
more difficult for the computer to 
recognize than machine type. Doyle 
(1960) has tested the Pandemonium 
on only a 10-letter alphabet, over a 
wide variety of handprinted letters 
(printed by different people), with 
about 85% success. Neisser and
Weene (1960) have run experiments 
on the abilities of people to identify 
similar letters correctly, showing that 
they achieve about 95% correct re- 
sponses. This is somewhat better 
than the program, and, of more inter- 
est, gives an indication of the rela- 
tively great amount of distortion in 
these inputs. Sherman, using inputs 
from the same source, on a different 
(and possibly more difficult) subset 
of 10 letters, achieved 83 successes, 6 
failures, and 31 “no answers” (that is, 
the machine did not make any choice 
because nothing had been programed 
in to handle the topological or geo- 
metric features encountered in the 
input). These results are difficult to 
evaluate. As they stand, the program
achieved 69% correct. If we assume
that additional rules that lead to 
decisions in all cases would give simi- 
lar percentage correct for the unde- 
cided inputs (like undecided voters) 
then we might estimate a 93% correct 
figure. 
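The two figures quoted for Sherman's program follow directly from the reported counts. A short check (variable names are illustrative):

```python
successes, failures, no_answers = 83, 6, 31
total = successes + failures + no_answers

# As reported: every "no answer" counts against the program.
as_reported = successes / total

# Under the assumption above: undecided inputs would split like the decided ones.
decided_only = successes / (successes + failures)

print(f"{100 * as_reported:.0f}% as reported, {100 * decided_only:.0f}% if undecided inputs split alike")
```

The gap between the two numbers is due entirely to how the 31 undecided inputs are counted, which is why the result is hard to compare with programs that always answer.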

Bledsoe and Browning (1959) re-
ported 78% correct, over five 36-
letter alphanumeric alphabets hand- 
printed by the same person, with 
their method using 2-tuples and 
slightly better results using 5-tuples. 
This seems surprisingly good consider- 
ing the simplicity of the program. 
Highleyman and Kamentsky have 
shown, however, that the success rate 
goes down markedly with a wider 
range of inputs (50 examples of each 
letter, in their case, made by 50 
different people, so that their test 
was probably relatively similar to 
Sherman's and Doyle's) to 20% cor- 
rect. Bledsoe and Browning (1961) 
ran further tests with this larger range 
of inputs (supplied to them by High- 
leyman and Kamentsky). They got 
87% success with 8-tuples and 98% 
success with 12-tuples when reproc- 
essing learned figures; but 40% suc- 
cess with 8-tuples when processing 
new figures. They got 100% cor- 
rect results on machine printing. 
Surprising success was got by High- 
leyman and Kamentsky using, es- 
sentially, Uttley's scheme, comput- 
ing probabilities for each cell in the 
matrix, and cross-correlating. This 
was their method of assessing the 
Bledsoe and Browning method and, 
with the same inputs, they achieved 
77% success. This may well be as 
good a success rate as Doyle's (1960) 
or Sherman's (1960), since the range 
of distortions seems comparable, and 
this program was asked to differenti- 
ate 26 letters, whereas their programs 
were asked to differentiate only 10. 
Roberts (1960), using inputs from
the same source as Sherman and
Doyle, got the best results—94% 
correct over the entire alphabet. 
Baran and Estrin (1960) got higher 
success rates, approaching 99%, but 
on a much more restricted range of 
inputs (the same typewriter was 
used, but the ribbon allowed to de- 
teriorate). Uhr and Vossler (1961a) 
got 96% success over the entire al- 
phabet using handprinted letters dif- 
ferent from the ones from which the 
program developed its measures and 
characterization lists. They also re- 
port success in processing a wide 
variety of inputs, such as simple line 
drawings, cartoon faces, spoken num- 
bers, and meaningless patterns (Uhr 
& Vossler, 1961b; Vossler & Uhr, 
1961). 
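The per-cell probability scheme mentioned above (computing a probability for each cell and cross-correlating) might be sketched as follows. This is an assumed, simplified form and not Highleyman and Kamentsky's actual decision procedure, which is not detailed here: the scoring rule (a correlation between the input and the stored probabilities, both recoded to the range -1 to +1), the function names, and the 3 x 3 toy letters are all illustrative choices.

```python
def cell_probabilities(examples):
    """Fraction of the training examples in which each cell is filled."""
    n = len(examples)
    return [sum(ex[i] for ex in examples) / n for i in range(len(examples[0]))]

def correlate(image, probs):
    """Correlation score between a 0/1 input and one class's probability map."""
    return sum((2 * x - 1) * (2 * p - 1) for x, p in zip(image, probs))

def classify(image, prob_maps):
    """Name of the class whose probability map correlates best with the input."""
    return max(prob_maps, key=lambda name: correlate(image, prob_maps[name]))

# Toy 3 x 3 letters, flattened row by row (hypothetical data).
t_examples = [[1, 1, 1, 0, 1, 0, 0, 1, 0],
              [1, 1, 1, 0, 1, 0, 0, 1, 1]]
l_examples = [[1, 0, 0, 1, 0, 0, 1, 1, 1],
              [1, 0, 0, 1, 0, 0, 1, 1, 0]]
prob_maps = {"T": cell_probabilities(t_examples),
             "L": cell_probabilities(l_examples)}

broken_t = [1, 1, 1, 0, 1, 0, 0, 0, 0]   # a T with its foot missing
print(classify(broken_t, prob_maps))
```

Each cell is treated independently, so the sketch ignores exactly the joint occurrences that the n-tuple methods try to exploit.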

Grimsdale, Sumner, Tunis, and 
Kilburn report their tests very in- 
completely. But they give some indi- 
cation that they examined a large 
array of patterns (including, e.g., 
Greek letters) over at least as many 
distortions as any of the other pro- 
grams with the important addition of 
large rotations (which their program 
is especially well equipped to handle). 
They do not report percentage re- 
sults, but they give the impression 
that their program was almost al-
ways right, probably with percentages
(if enough inputs had been run) ap- 
proaching any of the others. 

Rosenblatt tested five different
letters, approaching 100% accuracy
after learning. Translation was al- 
lowed, but no other distortions or 
variations were allowed. (Some of 
the other programs did not have sub- 
routines to handle translations, but 
simple algorithms for this exist—such 
as finding centers of gravity or en- 
velopes or leftmost and rightmost 
filled cells.) In other respects, his
inputs were quite restricted in va-
riety, more restricted than any but
those used for the commercial ma-
chines. When 26 letters were learned, 
absolutely no variations were allowed 
(Hay et al., 1960). 
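One of the simple translation algorithms alluded to above, shifting a figure so that its topmost and leftmost filled cells sit at the edge of the matrix, can be sketched as follows. The list-of-rows image encoding and the function name are assumptions for illustration; a center-of-gravity version would differ only in how the offset is computed.

```python
def shift_to_corner(image):
    """Translate a 0/1 image (list of rows) so its filled cells touch the top and left edges."""
    filled = [(r, c) for r, row in enumerate(image) for c, v in enumerate(row) if v]
    if not filled:
        return [row[:] for row in image]
    top = min(r for r, _ in filled)
    left = min(c for _, c in filled)
    out = [[0] * len(row) for row in image]
    for r, c in filled:
        out[r - top][c - left] = 1
    return out

# The same small figure in two positions (hypothetical data).
a = [[0, 0, 0, 0],
     [0, 1, 1, 0],
     [0, 1, 0, 0]]
b = [[0, 0, 1, 1],
     [0, 0, 1, 0],
     [0, 0, 0, 0]]
print(shift_to_corner(a) == shift_to_corner(b))
```

Applied before recognition, a step of this kind removes translation as a source of variation without touching the recognition logic itself.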

None of these methods has been 
used for figures with irregularities, 
such as nonconnected segments, or for 
dotted or implied figures. Rosen- 
blatt's and Doyle's programs would 
be able to handle these to some ex- 
tent; the programs of Grimsdale et al. 
and Sherman definitely would not, as 
presently programed, since they rely 
on following connectivities. Al- 
though several make use of experience 
with inputs in ways that might be 
characterized as “learning,” few are 
adaptive in the important sense that 
they would learn to modify previ- 
ously learned methods as a function 
of changes in the problem, as pre- 
sented by new sets of inputs. Some 
would still be able to succeed to the 
extent that new inputs were similar 
to old (see studies by Doyle, Rosen- 
blatt, Taylor) and some would modify 
themselves slowly over time, in their 
simple probability accumulation 
tables, in which new patterns would 
override the old (see studies by 
Bledsoe, Browning, Rosenblatt, 
Taylor). Although the program of
Grimsdale et al. at its present stage 
of development would fall down 
badly on these, it might do the best of 
all once it was programed to write not 
rigid lists but lists based on averages; 
and, even better, to switch into writ-
ing new lists when it finds itself mak-
ing frequent errors. 

The most striking picture that 
emerges is the relatively good results 
from the simplest of schemes. The 
commercial machines have a very low 
error rate, even with degradation, on 
printed inputs that may well range 
over a larger variety of styles (with 
the different fonts) than we realize. 
The 1-tuple method as implemented
by sophisticated probability counts
and correlation or decision making 
techniques (see studies by Baran & 
Estrin, Highleyman & Kamentsky) 
already does better on the more vari- 
able input sets, possibly as wide a 
variety of inputs as those used to test 
any other method, and this has been 
very clearly demonstrated in the 
comparison experiment with Bledsoe 
and Browning's 2-tuple method. But 
the Bledsoe and Browning method 
may do as well as the analytic ap- 
proaches of Selfridge and Doyle and 
of Unger. In these programs, only 
inadequate or restricted tests have 
been made. The program of Grims- 
dale et al. might be considered as 
having the greatest promise, but al- 
most any of these programs, if imple- 
mented further, would improve. 
Roberts’ results may well be the best 
of all with 94% correct over a wide 
range of inputs. It would not seem 
unreasonable to conclude that the 1- 
tuple method, a method that cannot 
possibly be as good as the more power- 
ful methods, has, as of today, given 
the best results. Nor is there much 
concrete evidence that anyone has 
been able to improve upon the abili-
ties of random (randomly chosen)
operators, despite the large amount
of analysis and experimentation effort 
expended. 


Discussion 
Discrete Analyses 


All pattern recognition schemes 
quantize a continuous input into a 
relatively small set of discrete out- 
puts. The analog methods still main- 
tain continuity within each member 
of the output set (that is, for example, 
for each curve segment followed); 
but they similarly work with discrete 
members and, in the actual decision 
process, whatever cutoff rules are in 
force at the moment act again in a
discrete manner. The input as sensed
determines the entire universe within 
which mappings can be done by any 
scheme and, in fact, also determines 
the mapping rules (operations) that 
are possible. For the vast majority of
pattern recognition programs, and 
also for classical theories of percep- 
tion, the input as sensed is a discrete 
mosaic, as in a television raster, 
internally stored computer matrix, 
or array of retinal cones. The analog 
methods similarly break the input up, 
into curve segments or radial dis- 
tances, or whatever other function, 
and thus similarly define the universe 
of operations. Note of course that the 
word "analog" has several meanings; 
we are here discussing it in the sense 
of being a one-one transformation of 
the input. And the analogy can also 
be complete or partial—in respect to 
something. Thus the continuous 
variation of energy or probability 
within a discretized cell spot is a 
particular aspect where analogy re- 
mains in many otherwise thoroughly 
discretized schemes. 

Looking now more carefully at our 
framework, we see that the 1-tuple 
method, as exemplified by the work 
of Uttley or Mattson or Baran and 
Estrin, makes use of the single set of n 
unconjoined input elements them- 
selves. The Bledsoe and Browning n- 
tuple method makes use of randomly 
chosen 1-tuples, 2-tuples, 5-tuples, or 
n-tuples, but, because of the way 
they are chosen, only a small set of 
the possible set (as programed so far, 
only N/n). It thus attempts to get 
around the enormous proliferation in 
the set of possible cell combinations 
as higher-order combinations are 
admitted, in the way Uttley origi- 
nally proposed, by making a random 
sampling of a restricted set of sets. 
When 2-tuples are used, it does some-
thing very similar to what, in a sta-
tistical pattern analysis, DuMas' D
statistic does in relation to Kendall's 
tau. Just as D computes pattern 
similarity by taking into considera- 
tion only the subset of differences 
between adjacent pairs from the set 
of all differences between pairs (ad- 
jacency usually being an arbitrary, 
essentially random, attribute) where- 
as tau considers all pairings, so 
Bledsoe and Browning choose N/2 2- 
tuples at random from the entire set. 
Most other pattern recognition pro-
grams are, essentially, specially chosen
sets of n-tuples and of subsets of the 
set of possible logical statements 
about these n-tuples. 

The restriction on looking merely 
at 1-tuples, which is guaranteed to 
ignore any geometric nearness rela- 
tions, can be overcome by choosing 
different kinds of n-tuples. For ex- 
ample, Unger and Bomba choose 
connected sets of cells, and then sets 
of these cells that move the geometric 
quality being sensed across certain 
specified transformations. Selfridge 
and Doyle do this and also use strings 
of cells (lines) across the matrix with 
which, effectively, counting is done. 
Connected n-tuples look at sets that 
are known to be geometrically and 
topologically related. The analytic 
methods that I have termed strong, 
the ones that look at topology and 
geometry, compute the sets of sets of 
n-tuples as functions of one another, 
in addition to making decisions in a 
sequential manner. 

At the other extreme in that it 
makes use of even less information 
than the 1-tuple method, the template 
method discretizes the matrix into 
template versus not template (or, 
when subparts are looked at, each 
subtemplate versus not that sub- 
template). It thus looks at and inte- 
grates across the equivalent of all of 
the 1-tuples that the template covers.
It is, then, the equivalent of the 1-
tuple method that makes no use of 
probabilities within each cell, but 
merely considers the two possibilities 
of match and mismatch. 

Why, then, does the template 
matching method work so well? The 
question would rather seem to sug- 
gest that the other more powerful 
methods are almost certain to lead to 
even more striking success when they 
are refined to the same extent. For it 
seems likely that the simplicity of the 
template method in terms of machine 
realization has led to sufficiently 
greater precision to more than over-
come its theoretical weaknesses.

There is, however, another pos- 
sibility that should not be ignored. 
The template method cannot be 
superior to the other methods, just 
as the 1-tuple method cannot be 
superior to the 2-tuple method. But, 
on the other hand, even though it is a 
lower bound, it might not be inferior, 
but rather they all might simply be 
equal in performance. Conceptually 
this does not seem likely, for from 
this would follow that joint occur- 
rences carry no information, and 
things such as nearness and inter- 
dependencies are of no consequence 
to patterns. A final, more likely pos- 
sibility is that the weaker methods 
may be good enough to solve the 
problems posed by visual patterns. 

Each pattern recognition program 
or theory is a hunch as to which sets 
of elements carry the information. 
Theories developed within psychology 
and physiology usually have in mind 
certain restricting conceptions as to 
plausible sets. But, in fact, the ma- 
chine developments only rarely seem 
less plausible or based more heavily on 
unsubstantiated assumptions as to 
mechanisms. Rather, they all feel 
like very similar sorts of theories. 

The mass of pattern recognition 
programs has, then, been a vast and
disorganized running test of which 
subsets of the entire universe should 
be used. The abilities of template 
methods and Bledsoe and Browning's 
method compared to any other that 
has been tested so far raise the un- 
pleasant possibility that all of this 
search may, so far, have led no place. 
It also suggests that a better way to 
search might be within a single pro- 
gram that would itself be doing the 
equivalent of running experiments, 
making its choice of operators, modi- 
fying and improving upon this choice, 
as a function of the outcomes of its 
own experiments. 


Gestalt Analyses 


Probably the most attractive con- 
ceptualization of the pattern recogni- 
tion problem, at least to the psycholo- 
gist, is that the interrelated parts, the 
pattern or gestalt, must be analyzed. 
When examined from this point of 
view, the various programs once 
again seem to fall into fairly nice 
order. Template matching looks at 
one example of a pattern, the equiva- 
lent of one solution to the underlying 
set of equations from which all ex- 
amples of the pattern class could be 
derived. This is a gestalt, but only a 
single fragile gestalt, one that will 
serve only to the extent that the range 
of solutions does not seriously overlap 
the ranges of solutions to other pat- 
tern sets being differentiated. The 
1-tuple matching method makes the 
least use of interrelations between 
parts of patterns, since it looks only 
at individual cells at a time. It makes 
a hidden use of these characteristics, 
however, since the probabilities that 
it accumulates in each cell are them- 
selves based upon covariations of 
sensed spots in inputs that have built
into them, because they are them- 
selves gestalts, constraining inter- 


relations. The 2-tuple and n-tuple 
methods take advantage of this fact, 
by accumulating probabilities for 
randomly sampled combinations of 
points on the matrix, and hence on 
the pattern—so that the correlations 
between points are being more di- 
rectly accumulated. The Perceptron, 
if its thresholds adjust to the proper 
level, is capable of doing something 
similar, since it gives positive re- 
sponses as a function of association 
area cells that are connected to input 
cells (usually more than one) that 
respond to one class but not to the 
other classes of patterns. 

Whereas n-tuple and Perceptron 
make use of random samplings of 
gestalt interrelations, and inevitably, 
then, mostly nonconnected interrela- 
tions, most of the analytic methods 
that use sets of operators that are not 
themselves interrelated look at rela- 
tively local gestalt characteristics. 
Thus, using Unger's (1959), Bomba's 
(1959), and Doyle’s (1960) demons as 
examples, angle and curve sensing 
operators are looking at only a part of 
the pattern. Some of the operators, 
such as Selfridge’s line counters, are 
looking at prechosen aspects of the 
total pattern that, in these cases,
seem to be chosen because they re-
flect pattern qualities in the whole.
The strong analytic methods, as 
exemplified by the program of Grims- 
dale et al. (1959), look at both local 
and distant gestalt qualities, in what 
would seem to be the most coherent 
and directed search for larger and 
larger patterns. For, although they 
decompose patterns into their simpler 
concatenated parts, the resulting list 
of separate elements is looked at ac- 
cording to the proper gestalt relations 
between the parts. This becomes so as 
a result of the topological listing of 
these elements, which relates them 
one to another, and of the common 
measurement unit by which they are
characterized—which can be either 
the matrix framework itself, as in 
Uhr's (1959), or a specific one of the 
elements, as in Attneave and 
Arnoult's (1956), suggestion. These 
methods, then, continue to make use 
of interrelations, and are, essentially, 
searchers after interrelations. Their 
fundamental problem is to find which 
parts and aspects of the pattern con- 
tain the crucial information—that is, 
which interrelations, and which types 
of combinations of these interrela- 
tions, to look at. 

If this analysis is correct, then it 
would seem most sensible to do basic 
experimentation with strong analytic 
programs. Their disadvantage is the 
difficulty of programing them and 
their slowness in running time. But 
their great advantage is that they 
process all the information in a co- 
herent and meaningful way. They 
could therefore be used (as they have 
not been used so far) as experimental 
devices for comparing different sets of 
rules for interrelating and combining 
parts. In this context, judicious com- 
binations of the simpler local opera- 
tors, such as the crucial single cells as 
identified, for example, by Evey's 
(1959) experiments, and the crucial 
n-tuples, such as those that are almost 
certainly carrying the burden of work 
in Bledsoe and Browning's (1959) 
and Rosenblatt's (1958a) programs,
may be developed. For these simpler 
operators can now be chosen by ex- 
amining those analyzed features of 
the pattern sets that carry the most 
information and then choosing sets of 
the more easily programed or wired-in 
simpler operators that will approxi- 
mate these analyses. 

The powerful analytic program 
should be considered an approxima- 
tion to an analog machine. For the 
most important class of computers 


about which it can give us new in- 
sights is the analog computer, one 
that could perform, now quite simply, 
the analyses that are so cumbersome 
on the digital computer. Thus this 
experimental approach should be 
equally pertinent for improving our 
understanding of promising analog 
methods, such as those of Kaz-
mierczak (1960) and Sprick and
Ganzhorn (1960). (And some of the 
analog transducing methods, espe- 
cially the resistor network, curve 
follower, and frequency accumulating 
capacitor, if hooked onto the digital 
computer, should materially simplify 
its operation.) 

Adequate comparison experiments 
between these alternate methods 
would be major steps toward crucial 
tests of alternate theories of percep- 
tion. We are in a good position to 
make these tests, once we have ex- 
amined the dimensions of patterns 
along which comparison should be 
made, and once we have become clear 
as to which are the independent as- 
pects or methods that go into a pat- 
tern recognition scheme. 


Models of Cognitive Processes 


If the analysis that has been made 
above to show that mere template 
matching or 1-tuple matching cannot 
be sufficient as a mechanism for hu- 
man perception is specious, then we 
may, with this array of pattern rec- 
ognition programs, be approaching a 
position where we will have an over- 
abundance of workable theories, 
rather than none at all. In any case, 
we would seem to be in the position 
where we might try to fit what are on 
the surface functional models (except 
for the few pattern recognition pro- 
posals that have combined function 
with neurological analogs), but very 
frequently have neurophysiological 
interpretations, to functional data, 
and, further, physiological models and
data, and models and data for other 
functions (such as nonvisual pattern 
perception, learning, concept forma- 
tion, and problem solving). These 
models gain the richness of Hebb's 
approach, but without any loss in pre- 
cision or decidability, since they are 
tested on the computer. Nor do they 
need to postulate possibly nonexistent 
and nonbuildable entities, such as the 
earlier Hebbian cell assemblies. 
Rather, programs of this sort are
essential to the enterprise of trying to 
write and test such a theory. Fur- 
ther, when we reach the point where 
we can make meaningful comparisons 
between relatively well developed 
programs, we will be in the position to 
examine critically and to compare 
entirely different approaches that are 
the equivalents of different higher- 
level theories: for example, random 
organization as embodied in Rosen- 
blatt’s or Hebb's theories versus or- 
ganized operations of the Pandemo- 
nium or topological sort. 

A strong argument can even be 
made for the relevance of pattern 
recognition for learning and concept 
formation in machines. The many- 
one mapping that must be effected is, 
to the extent it succeeds, a building 
up, through abstraction of the char- 
acteristics of the output sets, of a set 
of concepts. Here may well lie the 
difference between trial and error, 
conditioning, and random types of 
learning on the one hand, and gestalt, 
insightful learning on the other. The 
concept-formation effected by the 
identification of a pattern builds up 
tools that the organism or machine
can use to learn in much more power- 
ful and efficient ways. Thus we know 
that mathematics is successful only 
because of the very powerful concepts 
and operators it has built up. If the 
mathematician were to attack every 


new problem as one of learning to 
organize sets, he would have to re- 
capitulate the whole history of his 
discipline every few minutes. Simi- 
larly we know that ordinary people 
perceive new problems in the frame- 
work of their experience and knowl- 
edge and try to handle them in a 
directed way. Thus the difference be- 
tween a child who randomly applies 
methods to a mathematical problem, 
using trial and error, and one who 
applies the underlying concept (e.g., 
“add 1 and see whether the number is
now divisible by a and b” versus
“multiply a × b”) is the difference
between a machine that does not 
recognize the patterns in its previous 
experience, for subsequent use, and 
one that does. At this very general 
level, of course, the concept of pat- 
tern becomes extremely abstract, and 
possibly, of little help for our present 
attempts to solve specific pattern 
recognition problems. But in reality 
the concept “pattern” may be sim- 
ilarly vague in the more specific con- 
texts where it is traditionally used. 
And, although it is questionable 
whether an attack on the general pat- 
tern recognition problem would be 
profitable at the present time, given 
our lack of understanding as to the 
underlying common characteristics, it 
seems helpful to keep in mind the 
basic importance of pattern recogni- 
tion for computer intelligence and of 
perceptual learning for human intel- 
ligence. The problem of pattern 
recognition is the problem of discover- 
ing, making use of, and responding 
to organization.

Most of the pattern recognition
programs that have been written to
date have attacked only the limited
practical problem of recognition of 
the printed English alphabet. This 
may be an unusually easy, or an un- 
usually difficult, specific instance of 
the more general problem of form
perception. No one has made a close 
enough examination of the matter to 
allow us to do more than guess, but 
it seems unlikely that the alphabet is 
either. And, in fact, when people 
have tested programs of this sort on 
different pattern arrays—such things 
as geometric figures, embedded pat- 
terns, aerial photographs, handwrit- 
ing, line drawings—they still discover 
at least a certain amount of ability 
to handle these things. The problem 
of testing such programs for their 
power in modeling a human function 
is also a major one. Not only because 
of the technical difficulties of estab- 
lishing the relevant dimensions of be- 
havior and adequate experiments for 
sampling this behavior, but more 
importantly because of the deeper 
problems of the hypothetico-deductive
method: establishing our criteria for 
pertinent evidence. Possibly the 
major reason that the testing of such 
models seems to be such an unusual 
problem is that they present such a 
rich and varied description of the 
function that there are too many 
things to test, too many ways in which
the model predicts nature, might be 
refuted by test, or is capable of 
undergoing perfectly legitimate theo- 
retical modifications. 

Often these programs do not at- 
tempt to handle important phenom- 
ena, or sidetrack the problems of 
interest to the psychologist by han- 
dling them in trivial ways. Thus, for 
example, size-constancy can be ob- 
tained by drawing a mask around a 
pattern and normalizing the mask; 
contextual information can be han- 
dled by storing large lists garnered in 
the most pedestrian tradition of total 
recall from previous experience and 
simply reflecting what was the case 
before. But phenomena such as size 
and object constancies, interactions 


between figure and ground, a dy- 
namic memory, learning and adapta- 
tion, can all be studied in the context 
of programs such as these. In many 
instances, rudimentary abilities to 
handle a new phenomenon that the 
psychologist would argue was of vital 
importance could be given to the 
program in nontrivial, not unrealistic 
ways. When such additions serve to 
strengthen or simplify the existing 
program, they immediately become of 
service toward our understanding of 
intelligence in machines. When pro- 
grams modified in such ways are sub- 
jected to tests of hypotheses, they 
may show capabilities of exhibiting 
a wide variety of new and possibly 
interesting behavior. When run
alongside a program of parallel research on
human subjects, which examines be- 
havior that can be compared with the 
behavior of the computer program, 
this offers at least the possibility of 
making the computer a vehicle for 
exploration into very complicated, 
and possibly very rich and powerful, 
theories of behavior. 
LEARNING IN FLATWORMS AND ANNELIDS¹


ALLAN L. JACOBSON²
University of Michigan 


This article attempts an exhaustive review of the research purporting to 
demonstrate behavioral modifications in earthworms, planaria, and re- 
lated organisms. Studies are grouped first according to phylum, and for 
each of the phyla considered according to certain subcategories of learn- 
ing: habituation, classical conditioning, instrumental learning, and 
variability. Examination of the literature reveals that whereas earlier 
work was often ill-controlled, more recent research has for the most part 
been rigorous and convincing. It is concluded that learning and related 
phenomena have indeed been demonstrated clearly in each of these 2 
phyla, and that research on these animals provides a promising means 
of investigating the “molecular” basis of learning. 


Subhuman organisms are indis- 
pensable in the study of behavior. 
Lower animals often provide, in addi- 
tion to greater opportunity for con- 
trol, unique structural or functional 
properties which make them espe- 
cially suited to the investigation of a 
given problem or problem area. For 
example, goldfish are ideal subjects 
for the study of interocular transfer 
because their optic nerves are com- 
pletely crossed (see McCleary, 1960) 
and pigeons are employed in much 
operant conditioning work because of 
their ability to peck rapidly and for 
long periods of time. 

Perhaps as much as 90% of the 
research on animal learning has em- 
ployed the rat as subject (Bitterman, 
1960). Of recent years, however, 
there has been a lively and increasing 


1 This review was undertaken while the 
writer was a Public Health Fellow at the Uni- 
versity of Michigan. Various publication and 
translation costs were paid for by funds from 
grants to James V. McConnell by the Atomic 
Energy Commission and the National Insti- 
tute of Mental Health. 

The writer is indebted to McConnell for his 
invaluable assistance in the preparation of 
this review and to Edward L. Walker for his 
critical reading of the manuscript. 

2 Now with Bio-Organic Chemistry Group, 
Donner Laboratory, University of California, 
Berkeley. 
interest in the learning capacities of
the group Vermes—the worms—or, 
more specifically, the phyla Platy- 
helminthes (flatworms) and Annelida 
(segmented worms). The study of 
such organisms can yield information 
both about the generality of principles 
derived from the study of “higher” 
organisms, and also about the physio- 
logical basis of learning in these, and 
perhaps other animals. 

This paper is a review of the re- 
search on learning in worms including 
recent developments. Each study has 
been placed in one of several cate- 
gories: Habituation, Classical Condi- 
tioning, Instrumental Learning, and 
Variability. Certain ambiguous cases, 
such as avoidance learning, have 
simply been placed in the most con- 
venient category (in this case, that of 
Instrumental Learning). The cate- 
gory of Variability is intended to 
subsume studies of motivation and of 
alternation. 


PHYLUM: PLATYHELMINTHES 
Habituation 


The first study on flatworms 
demonstrating a behavioral modifica- 
tion which may be considered related 
to learning was that of Walter (1908), 
who reported several habituation 


phenomena in planaria, Dugesia gono-
cephala and Dugesia maculata. First,
Walter found that a slight rotation 
of the aquarium in which the planaria 
were housed produced a momentary 
halt in the gliding of the animals. If 
he repeated the rotation at 1-second 
intervals, the halting diminished and 
did not reappear after about 12 trials. 
The response was, however, rein- 
stated if the aquarium was left sta- 
tionary for only a minute. Next, 
Walter placed planaria in a field of 
nondirective light which was of two 
different intensities and observed the 
number of “wigwag responses” 
(swinging, back-and-forth movements 
of the anterior part of the body) 
that the animals made when crossing 
the critical line between the two fields 
of light. These responses decreased 
as the number of crossings increased. 
Walter also found that planaria 
oriented themselves progressively 
faster to a directive light when the 
position of the light was suddenly 
changed. 

Dilk (1937) attempted to deter- 
mine the temporal relationships which 
would produce habituation to vibra- 
tion in planaria, D. gonocephala. 
Habituation occurred when vibration 
was presented at short intervals for a 
duration of 1 minute or more per 
presentation, but did not occur when 
the interstimulus interval was no 
less than 10-15 seconds and the 
duration of vibration was no greater 
than 1-2 seconds. 

Miller and Mahaffy (1930) con- 
ducted perhaps the only other pub- 
lished study on habituation, this on 
the trematode Cercaria hamata. A 
shadow cast on these animals elicits 
immediate swimming in most of 
them. After repeated shadowing at 
short intervals (1-2 seconds), C.
hamata no longer responds. But as 
in Walter's study, an interval of a 


minute was sufficient to erase the 
habituation completely. No decre- 
ment in swimming occurred when the 
animals were stimulated by repeated 
weak touches on the body rather 
than by shadows. 

Discussion. Systematic investiga- 
tions on habituation in flatworms 
have apparently never been per- 
formed to determine the limits of 
habituation, the effects of temporal 
and intensity parameters, the inter- 
action of different stimuli, or the 
duration of habituation under differ- 
ent conditions. The few data there 
are indicate that habituation to 
photic and mechanical stimuli does 
occur in flatworms, and that this 
habituation may be quite short-lived 
(on the order of seconds or minutes). 


Classical Conditioning

To Van Oye (1920)³ goes credit
for the first attempt to produce in 
flatworms a more complex behavioral 
change than habituation. Van Oye's 
training procedure constitutes an ex- 
ample of instrumental learning and 
will be described in the next section. 
Subsequent to this study, Hovey
(1929) demonstrated what he called 
“associative hysteresis” in flatworms. 
Actually Hovey's training operations 
amount to neither classical nor in- 
strumental conditioning, but to the 
reversal of an innate taxis. The poly- 
clad flatworm Leptoplana is usually 
quiescent in the dark and becomes 
active when exposed to light. Hovey 
left Leptoplana in the dark for 12 
hours, exposed them to light for 5 
minutes, placed them back in the 
dark again for 30 minutes, then ex- 
posed them to light for 5 more min- 
utes, and so on, until 25 exposures to 
light had been made. Each time the 
animals moved in the light, he 


3 We are grateful to Roman Kenk for call-
ing this little-known report to our attention.
touched them on the snout with a
small rod causing a momentary pause 
in their locomotion. Over the course 
of successive exposures, fewer and 
fewer touches were necessary to keep 
the animals immobile. Control groups
showed that this change was not 
attributable to sensory adaptation, 
motor fatigue, or injury to the snout. 
After a rest of 10 hours, Hovey (1929) 
reports, “many touches with the
stick were necessary to renew the
association” (p. 328). Yet Hovey's
data show that there were consider- 
able savings: only one exposure 
period was required to reduce re- 
sponding to the level attained orig- 
inally after some 15 periods. 
Shortly after Hovey's work, Soest 
(1937) and Dilk (1937), working in 
the same laboratory, reported studies 
on avoidance training in flatworms. 
These experiments will be discussed 
in greater detail in the next section 
(Instrumental Learning) and are 
mentioned here primarily for the sake 
of historical continuity. Among his 
various avoidance training CS-UCS 
(conditioned stimulus-unconditioned 
stimulus) combinations, however, 
Dilk tested planaria, D. gonocephala, 
in one situation which qualifies as a 
classical conditioning paradigm. The 
response to light (CS) was observed 
to increase markedly in each animal 
when the light was followed after an 
interval of 2 seconds by vibration 
(UCS). As no control groups were 
run, the results cannot be considered 
as more than suggestive. That such a 
precaution is necessary was indicated 
by Sgonina's (1939) report that re- 
peated presentation of light alone, 
touch alone, or simultaneous presen- 
tation of the two, effected a period of 
sensitization to the light in planaria, 
Dugesia lugubris. 
The study of learning in flatworms 
was relatively neglected after the 


work of Van Oye, Hovey, Soest, and 
Dilk until, in 1955, Thompson and 
McConnell provided what appears to 
be the first controlled demonstration 
of classical conditioning in this 
phylum. Planaria, Dugesia doroto- 
cephala, were placed individually in a 
water-filled trough which could be 
illuminated nondirectionally from 
above—an improved version of the 
apparatus employed in this and sev- 
eral subsequent experiments has re- 
cently been described (McConnell, 
Cornwell, & Clay, 1960). A current 
could be passed through the water by 
means of electrodes embedded at 
either end of the trough. The planaria 
were exposed to 150 light-shock pair- 
ings massed in three blocks of 50 
trials each with an average intertrial 
interval of 20 seconds and an interval 
of 5 minutes between blocks. The 
light (CS) was on for 3 seconds—dur- 
ing the last second of which the shock 
was passed through the water. The 
shock (UCS) invariably produced an 
immediate longitudinal contraction 
of the animal; the light, after a brief 
habituation period, elicited no reli- 
able response rate in control animals. 
Over the series of light-shock pairings, 
the frequency of responses to the light 
(prior to the shock) showed a steady 
and significant increase, both in con- 
tractions and turns. Control groups 
exposed to repeated shocks, repeated 
lights, or neither shock nor light, all 
showed a slight decrease in respond- 
ing over trials, thus apparently 
eliminating sensitization or “spon- 
taneous” changes as possible ex- 
planations for the data. 

Cummings and Moreland (1959), 
however, have disputed the inter- 
pretation of these results as classical 
conditioning. They found in planaria 
that over a series of 250 pairings of 
vibration (CS) and shock (UCS), the 
number of responses to vibration in-
creased for about 150 trials, then
dropped off sharply. In animals ex- 
posed to vibration alone, the same 
rise in responsivity occurred without 
the subsequent drop. The several 
procedural differences between this 
experiment and that of Thompson 
and McConnell, however, weaken the 
argument that the results of the two 
experiments are contradictory. 

In the same vein as Cummings and 
Moreland, Halas, James, and Stone 
(1961) contend that the results of 
Thompson and McConnell are at- 
tributable to the unconditioned elici- 
tation of responses by light. They 
exposed groups of planaria, Dugesia 
tigrina, to 150 light-shock pairings, as 
had Thompson and McConnell, and 
then exposed them individually to 25 
3-second intervals of light alone 16- 
22 hours later. The incidence of re- 
sponses to the light during these trials 
did not diverge significantly from that 
in other groups which had received 
150 light presentations, 150 shock 
presentations, or no-light and no- 
shock. In fact, none of these groups 
was distinguishable from any other, 
but all groups manifested a signifi- 
cantly higher frequency of response 
than a group that was merely ob- 
served for 2-second intervals without 
any illumination change during the 
25 test trials. These data suggest that 
(a) there was no sensitization by 
light- or shock-exposure, since these 
groups did not differ from the group 
that received neither light nor shock; 
and (b) light elicited responses from 
the worms, as evidenced by the dis- 
crepancy between the group that 
received light stimulation during the 
test trials and the group that did 
not. 

These results, interesting as they 
are, do not necessarily bear upon the 
Thompson and McConnell study, con-
sidering the large and probably cru-


cial disparities in design between the 
two. One wonders what would have 
happened, for example, had Halas 
et al. tested immediately after train- 
ing or continued testing long enough 
for habituation to have occurred. In 
this connection might be cited Harris’ 
(1943) discovery that differences be- 
tween the responses of groups of 
differently-treated rats may not be- 
come evident until relatively late in 
the extinction process. 

It is important to recognize that 
Thompson and McConnell (1955) 
and Halas et al. (1961) employed 
quite different criteria of learning: 
whereas the former experimenters 
recorded response changes during 
training, the latter group assessed 
resistance to extinction. Cornwell 
(1961) has succeeded in replicating 
both of the foregoing sets of results in 
the same experiment by observing 
both responses during training and 
responses during subsequent extinc- 
tion on the same planaria. Halas, 
James, and Knutson (1962) have re- 
cently replicated the Thompson and 
McConnell findings. Cornwell (1961) 
has argued cogently that the failure 
of Halas et al. (1961) to find differ- 
ences between groups in extinction 
performance may well have been due 
to dissipation of CS habituation dur- 
ing the interval between training and 
testing, and the consequent increased 
reactivity to light in the several 
groups. Such a change would tend to 
make the groups converge during ex- 
tinction. 

Baxter (1961) made a further at- 
tempt to distinguish between classi- 
cal and pseudoconditioning effects in 
the planarian, D. tigrina. The condi- 
tioning procedure in this experiment 
was modeled after that of McCon- 
nell, Jacobson, and Kimble (1959; de- 
scribed below), with 50 trials per day 
of light and shock stimuli being ad-
ministered. For the conditioning
group, these presentations were 
paired in the usual overlapping fashion 
with a minimum intertrial interval of 
30 seconds. For the pseudocondition- 
ing group, light or shock was given 
every 30 seconds in random se- 
quence. Over the course of 250 trials, 
the conditioning group exhibited a 
steady increase in responses to the
light, while in the pseudocon-
ditioning group an equally steady de-
crease occurred. The difference in
performance between groups was
highly significant (p < .001). The de-
crease in the response rate of the con-
trol group seems to reflect a habitua-
tion process free of counteracting
sensitization effects. The two groups
did not differ on a subsequent block 
of 50 extinction trials. Some impor- 
tant parametric information was also 
provided by this study: in both classi-
cal and pseudoconditioning groups,
level of responding was directly re- 
lated to both UCS and CS intensity. 

Baxter's results are consonant with 
the argument advanced earlier about 
different measures of learning. What 
appear superficially to be incompati- 
ble results are thus in fact quite 
reconcilable. In this regard, two 
qualifications are warranted: 

1. Further data are necessary be- 
fore any secure conclusions can be 
drawn about differences or lack of 
differences in extinction between 
classically- and pseudoconditioned
planaria. In particular, the level of
conditioning should be carried fur-
ther than has heretofore been the
case, and attention should be paid to 
the CS habituation factor. 

2. Lack of correlation between 
measures of learning, a not uncom- 
mon finding in psychology (Kimble, 
1961), does not necessarily amount to 
a vitiating factor in studies of learn- 
ing. The experimenter is often re- 
quired to select the most appropriate 


criterion in a particular situation, and 
in the present case that criterion 
would seem to be acquisition rather 
than extinction. Hilgard and Mar- 
quis (1940) differentiate between 
classical and pseudoconditioning by 
requiring that in the former the 
response increment be “a function of 
the repetition of conditioned and un- 
conditioned stimuli in precise rela- 
tionship" (p. 42). By this standard 
definition, Baxter, Halas, Cornwell, 
and McConnell have convincingly 
shown that planaria can form clear- 
cut conditioned responses. 

In the past few years, several other 
experiments on classical conditioning 
in the planarian have exploited the 
animal's great regenerative ability to 
attempt to ascertain the "locus" of 
learning in this animal. McConnell, 
Jacobson, and Kimble (1959) employed
the light-shock paradigm of
Thompson and McConnell, using
spaced training (50 trials a day) and 
establishing a definite criterion of 
conditioning (23 responses to the CS 
out of any 25 consecutive trials). 
Once a worm (D. dorotocephala) had 
reached criterion, it was immediately 
cut in half transversely and the halves 
isolated and allowed to regenerate. 
When regeneration was complete 
(about 4 weeks), all animals were re- 
trained to the original criterion. 
Whereas an average of 134 trials had 
been required for initial training, only 
40 and 43.2 trials were required for 
retraining in the original head- and 
tail-halves, respectively. These test- 
retest differences were highly signifi- 
cant. A trained but uncut group 
showed the same amount of savings 
as did the experimental animals. A 
group that was cut and then trained 
(after regeneration) actually took 
more trials to reach criterion than 
were required for the initial training 
of the experimental group, thus elimi- 
nating the possibility of sensitization 




by the cutting and regeneration proc- 
ess. 
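
The criterion described above (23 responses to the CS out of any 25 consecutive trials) can be expressed as a sliding-window rule. The following is only an illustrative sketch, not the procedure of the original experimenters: the function name, the requirement that a full block of 25 trials be available before the criterion is evaluated, and the convention of reporting the trial on which the criterion is first met are all assumptions made here.

```python
from collections import deque


def trials_to_criterion(responses, window=25, required=23):
    """Return the 1-based trial on which the criterion is first met.

    `responses` is a sequence of booleans, one per trial (True means a
    response to the CS occurred). The criterion is `required` responses
    within the most recent `window` consecutive trials. Returns None if
    the criterion is never reached. (Hypothetical helper for illustration.)
    """
    recent = deque(maxlen=window)  # outcomes of the last `window` trials
    hits = 0                       # running count of responses in `recent`
    for trial, responded in enumerate(responses, start=1):
        if len(recent) == window:
            hits -= recent[0]      # oldest trial is about to drop out
        recent.append(1 if responded else 0)
        hits += recent[-1]
        if len(recent) == window and hits >= required:
            return trial
    return None


# Ten trials without a response followed by an unbroken run of responses.
record = [False] * 10 + [True] * 25
print(trials_to_criterion(record))
```

Whether the trials within the criterion block itself were counted toward "trials to criterion" in the studies reviewed is not stated in the text, so figures from this sketch should not be compared directly with the published values.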

An independent replication of this 
experiment by Agoston (1960) yielded 
identical results. 

Following up these findings and 
using the same conditioning arrangement,
McConnell, R. Jacobson, and
Maynard (1959) tested for regenera- 
tion in "totally reformed” worms. 
They reasoned that if worms totally 
reformed from a conditioned “an- 
cestor" showed savings, this would 
provide evidence for a chemical proc- 
ess involved in the regenerative trans- 
mission demonstrated by McConnell, 
Jacobson, and Kimble (1959). 
McConnell, R. Jacobson, and May- 
nard (1959) cut large planaria, D. 
dorotocephala or D. tigrina, in half pos- 
teriorly to the pharynx, conditioned 
the head end using relatively spaced 
trials, then cut off the regenerated 
tissue up to the pharynx, and isolated
the head end. After the head had re- 
generated a new tail they again cut 
the worm posteriorly to the pharynx 
and allowed both halves to regener- 
ate. Finally, they tested both of these 
latter groups. It will be noted that 
one of these (referred to hereafter as 
Group A) contained the head half of 
the original worm, whereas the other 
group (B) contained animals con- 
sisting completely of tissue reformed 
or regenerated subsequent to condi- 
tioning. Nonetheless, both Group A 
(X̄ = 56 trials to criterion) and Group
B (X̄ = 122 trials) showed significant
savings as compared with the original
training (X̄ = 268 trials). The difference
in savings between Groups A
and B was also significant. 
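
The size of these savings can be made concrete with the conventional percent-savings score, (original trials - relearning trials) / original trials. The sketch below applies that formula to the group means quoted above (268, 56, and 122 trials). The formula is the usual textbook definition and is assumed here; the text does not say how savings were computed or tested in the original report, and the function name is hypothetical.

```python
def percent_savings(original_trials, relearning_trials):
    """Percent savings: the share of the original training effort that
    was not needed on relearning. (Conventional formula, assumed here.)"""
    if original_trials <= 0:
        raise ValueError("original_trials must be positive")
    return 100.0 * (original_trials - relearning_trials) / original_trials


# Mean trials to criterion as quoted in the text.
original_mean = 268
relearning_means = {"Group A": 56, "Group B": 122}

for group, mean_trials in relearning_means.items():
    score = percent_savings(original_mean, mean_trials)
    print(f"{group}: {score:.1f}% savings")
```

On these means Group A shows roughly 79% savings and Group B roughly 54%. Scores computed from group means in this way only summarize the comparison; they say nothing about its statistical significance.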

Ernhart (1961) employed a differ- 
ent technique of worm cutting to 
attack the question of whether two 
heads are really better than one. If 
the head of a planarian is split me- 
dially between the eyes and the halves 
kept apart during regeneration, a 
worm with two complete heads results.
In conditioning (light-shock)
these animals 3 weeks after surgery, 
Ernhart found that both the two- 
headed group and a “sham-operated” 
group took significantly longer to 
reach criterion than a normal group. 
When, however, an interval of 6 
weeks was interposed between sur- 
gery and training, the two-headed 
group conditioned significantly faster 
than either of the control groups. It 
may be, as Ernhart has suggested, 
that 3 weeks is an insufficient period 
for complete "behavioral" regenera- 
tion to occur; this would explain the 
discrepancy in these results. 

Returning to the locus of learning
issue: the experiments of McConnell 
and his co-workers (1959) both indi- 
cated that whatever physiological 
changes may mediate conditioning in 
this animal must occur throughout 
the animal's body. A most intriguing 
study by Corning and John (1961) 
tested the hypothesis that ribonu- 
cleic acid (RNA) may somehow be 
involved in mediating learning or 
memory in the planarian. Previous 
theoretical speculation (Hydén, 1959) 
and certain empirical results with 
other organisms had suggested that 
this might be the case, and the rela- 
tive simplicity of the planarian 
afforded an opportunity for a clearer 
test than could be conducted with 
more complex animals. Corning and 
John (1961) permitted the halves of 
conditioned (light-shock) worms (D. 
dorotocephala) to regenerate in a w 
solution of ribonuclease (RNA-ASE), 
a compound which is destructive to 
RNA. They reasoned as follows: 
Conditioned tails, regenerating in the pres- 
ence of RNA-ASE, might be expected to pro- 
duce anterior portions with a depleted or al- 
tered RNA structure, perhaps due to influ- 
ences exerted at the regenerating interface. 
Such an organism must then have a naive 
dominant head. Conversely, since trained 
heads only have a non-dominant tail to regrow,
they should demonstrate a greater
degree of retention (pp. 1363-1364). 


The hypothesis appeared to be sup- 
ported by the results: the heads re- 
generated in RNA-ASE exhibited 
retention as great as that of control 
heads and tails regenerated in pond 
water, whereas the tails regenerated 
in RNA-ASE showed essentially no 
savings. 

A further attack upon the relation- 
ship between conditioning and bio- 
chemistry in the planarian has ex- 
ploited the tendency of hungry 
planaria to eat pieces of other pla- 
naria. Preliminary results from 
McConnell's laboratory (McConnell, 
1962; McConnell, Jacobson, & Humphries,
1961) appear to indicate that
such cannibal worms (D. tigrina), 
after ingesting pieces of conditioned 
(light-shock) worms, show a higher 
response rate during the initial stages 
of conditioning than do cannibals 
which have ingested naive planaria. 
A replication of this study conducted 
in John's laboratory has confirmed 
McConnell’s results, the experimental 
cannibals (D. tigrina) in John’s 
study showing a significantly higher 
response rate than cannibal controls 
on each of the first 8 days of condi- 
tioning (25 trials per day). 

Discussion. It would appear that 
classical conditioning, as distinct 
from pseudoconditioning, can be con- 
sidered a valid phenomenon in flat- 
worms (at least in planaria). The 
mechanism of regenerative transmis- 
sion of the conditioned response is at 
present a most exciting puzzle, de- 
serving of intensive investigation. 
This problem, essentially a physio- 
logical one, still merits much be- 
havioral study to determine the limits 
and conditions of such transmission. 
A start has been made in attacking the
"basis of memory" in this animal at a
more molecular level also. It may be
that more than one "mechanism"
will be required to account for the
planarian's ability (a) to "store" a
behavioral modification, and (b) to
transmit this to succeeding asexual
progeny. Much work is needed to
resolve these questions.

† E. R. John, personal communication,
1961.


Instrumental Learning 


Van Oye (1920) succeeded in train- 
ing planaria to approach and obtain 
food by a novel and “unnatural” 
route. When food is introduced into
their bowl, planaria receive chemical 
stimuli from it. The planaria crawl 
about on the bottom of the surface 
film of the water or on the bottom and 
sides of the bowl, making orienting 
movements, until they reach the food. 
When the food is attached to a piece 
of wire, and the latter suspended in 
the water such that the food is in the 
middle of the water, the only way 
that the planaria can reach the food is 
to crawl down the wire. This Van 
Oye's subjects did not do in several 
initial sessions, although they did 
manifest the characteristic "searching"
behavior on introduction of the
food into the water. Van Oye then 
initiated a “training” schedule which 
consisted of presenting the food on 
the wire at a progressively greater 
depth, beginning at the surface of the 
water. Eventually (after some 20 
sessions), the worms consistently
reached the food by way of the wire 
within 20 minutes. This response was 
maintained in a series of tests at 2- 
day intervals, but disappeared after a 
lapse of 2 months. Although no con- 
trol groups were run, it is reasonable 
to expect that the planaria would not 
have come to crawl down the wire 
had there been no food at the end of 
the wire during "training." 

Further evidence of instrumental 
learning capacities in flatworms did 
not appear until, in 1937, both Soest
and Dilk reported what may be re- 
garded as avoidance learning in flat- 
worms. Soest placed Stenostomum, 
a rhabdocoel, in a circular bowl half 
of which was weakly illuminated and 
the other half dark. Preliminary ob- 
servations showed that the animal 
spent approximately half of its time 
on each side and crossed the light-dark 
border without hesitation. Animals 
in one group were then shocked each 
time they attempted to cross into the 
lighted half. After some 10-20 trials,
the animals began to turn back 
"spontaneously" into the dark as 
they came to the line, thus avoiding 
the shock. This response was then 
maintained with only occasional 
transgressions. Extinction was very 
rapid—about 5-10 trials (in as many 
minutes). A pseudoconditioning 
interpretation (sensitization to light 
via repeated shocks) could be made of 
this data were it not for the fact that 
Soest also trained another group to 
avoid the dark and remain in the light.
As one would expect, this training 
was more difficult to accomplish— 
but it was in fact accomplished. It is 
still possible that the shocks sensi- 
tized the animals to the change of 
stimulation at the light-dark border. 
A control for this, which would in- 
volve repeated shocks administered 
without regard to the animal's loca- 
tion in the bowl, was not run. Soest's 
data reveal no effect of the shock on 
the animal's general activity level. 

Dilk (1937) performed a similar 
experiment on planaria (D. gono- 
cephala). Here too the bottom of the 
experimental bowl consisted of two 
(in some cases three) parts, different 
in any of several ways: rough versus 
smooth, wavy versus flat, light 
versus dark. In each case the animal 
received a vibration or shock stimulus 
when it crossed the border separating 
one cue area from another; and for 


each pair of stimuli the experiment 
was repeated twice on different 
groups, a different stimulus being 
positive each time. Like Soest, Dilk 
found it easier to train his animals to 
avoid light than to avoid dark, but he 
reports success both ways. He also 
reports success in training worms to 
avoid rough, but little success in 
training worms to avoid smooth. 
Training to wavy versus flat was 
successful only when a diffuse bright 
light was given simultaneously with 
vibration. On the whole, Dilk's ex- 
periment is open to the same criti- 
cism as Soest's, i.e., that he failed to 
employ controls for sensitization. 

Of late, interest in instrumental 
learning in planaria has centered on 
their ability to learn a simple maze. 
First Ernhart and Sherrick (1959) 
and more recently Best and Rubin- 
stein (1962) succeeded in establishing 
a maze habit in these beasts. Ernhart 
and Sherrick performed the instru- 
mental analogue of the McConnell, 
Jacobson, and Kimble (1959) study. 
Ernhart and Sherrick ran their animals
(D. maculata = D. tigrina) to a
criterion of three consecutive errorless 
trials in a two-unit water-filled T 
maze in which the goal box was 
darkened (planaria are negatively 
phototaxic). As each worm reached 
criterion, it was cut in half and al- 
lowed to regenerate, and then both 
halves were trained to criterion again 
in the same maze. Significant and 
equal savings were found in both re- 
generated heads and tails. These 
groups required slightly but signifi- 
cantly more trials on relearning than 
did a retention control group, which 
was trained but not cut. A control 
group for sensitization by cutting and 
regeneration took significantly more 
trials to attain criterion than did nor- 
mals in original acquisition, a rather 
curious result, but one which was also 
found by McConnell, Jacobson, and 
Kimble (1959) and by Corning and
John (1961). 

A detailed investigation of maze 
learning in planaria (Cura foremani, 
D. tigrina) by Best and Rubinstein 
(1962) employed several varieties of 
two-choice mazes. Withdrawal of 
water from the maze started the 
worms running and restoration of 
water served as reinforcement. One 
arm of the maze was lighted and the 
other dark; some worms were trained 
with light as the positive cue (leading 
to reinforcement), others with dark 
as the positive cue. Under these con- 
ditions, learning occurred in all the 
types of mazes used, as indicated by a 
significant increase in proportion of 
animals selecting the correct alterna- 
tive. Other possible explanations of 
the results were discussed and rather 
convincingly rejected. In the first 
two types of mazes that the experimenters
used, a significant and sharp
reversal of the worm’s preference con- 
sistently followed the period of maxi- 
mal correct choices, and was in turn 
followed by a complete lethargy in 
the experimental apparatus, even 
when the water was withdrawn. No 
such reversal or decrement occurred 
when a third maze, which permitted 
the worm access to a larger chamber 
between trials, was employed and 
when in addition fewer trials per day 
were given. 

Best and Rubinstein also tested 
several worms on a conditional maze 
problem, in which the dark arm was 
correct when the walls of the maze 
were rough, the lighted arm being 
correct when the walls were smooth. 
Amazingly, one of the four worms 
tested seems to have mastered this 
problem, which is often difficult even 
for much “higher” animals. 

In apparent contrast with the 
studies by McConnell and associates 
(1959) and Ernhart and Sherrick 
(1959) is the casual mention by
Angyan (1959) of some training work
that he has conducted on planaria. 
He reversed their usual negative 
phototaxis by pairing light with food. 
After cutting and subsequent regen- 
eration, Angyan reports, the original 
head halves of these animals retained 
the light-seeking response while the 
original tail halves did not. Unfor- 
tunately, no further information on 
this provocative finding is available 
at present to the writer. 

Discussion. It is safe to say that 
planaria are capable of acquiring an 
instrumental as well as a classically- 
conditioned response. Indeed, it 
would be unparsimonious at present 
to assume that the two types of re- 
sponse involve different mechanisms 
in this animal. The findings of Best 
and Rubinstein (1962) on the condi- 
tional problem make one wonder just 
how complex a response planaria can 
learn. There appears to be no reason, 
now that fundamental learning ca- 
pacities have been demonstrated, that 
other learning principles can not be 
assayed in this phylum. For example, 
latent learning has seemingly been 
demonstrated in earthworms (Bha- 
rucha-Reid, 1956). Can it occur also 
in planaria? It would be most inter- 
esting indeed if such supposedly 
"cognitive" phenomena could be 
shown to occur in an organism of such 
physiological simplicity. More con- 
ventional paradigms remain unin- 
vestigated: it should be possible, for 
example, to test the animals in a con- 
ventional avoidance problem (e.g., of 
the Miller-Mowrer type), or on a 
discrimination reversal task. In addi- 
tion, the results of Angyan (1959) 
deserve careful scrutiny and replica- 
tion, since they stand in such star- 
tling contrast to studies which have 
found retention to be "distributed" 
throughout the whole organism, 
rather than only in the cephalic 
region as Angyan reported. 

Variability

Both paramecia (Lepley & Rice, 
1952) and meal worms (Grosslight & 
Ticknor, 1953) have been reported to 
exhibit alternation of turns in a T 
maze—although it may be noted that 
Jensen (1959) has presented an alter- 
native interpretation of these results. 
In a thorough attempt to test alterna- 
tion in planaria (D. dorotocephala), 
Rice and Lawless (1957) set up 15 
mazes consisting of various sequences 
of forced turns followed by "choice"
turns and differing also in distances 
between turns. Twenty-eight ani- 
mals were run for one trial each on 
one of the maze variations (i.e., total 
N=420). The essential hypothesis, 
that after a turn in one direction an 
animal would be more likely to turn 
in the opposite direction, was not sup- 
ported. In fact, a 50-50 distribution 
of responses was found under all 
conditions. It is possible that being 
placed in a new environment dis- 
turbed the worms sufficiently to mask 
any alternation tendencies that would 
have been demonstrated otherwise. 

Discussion. To date there has been 
no successful demonstration of spontaneous
alternation tendencies in flatworms.
Yet results with other organisms,
as well as the "satiation"
reactions observed by Best and 
Rubinstein (1962), suggest that this 
topic is deserving of further research. 
It should be determined, for instance, 
whether planaria would alternate on 
successive trials at the same choice- 
point. Moreover, it might prove 
interesting, should alternation be 
found, to attempt to discover the 
classes of variables ("stimulus,"
"place," "response") with respect to
which the animals will alternate. 


PHYLUM: ANNELIDA 
Habituation 


A number of studies have reported 
habituation in such various species of 
sedentary polychaetes as sandworms
and tubeworms. These animals live in
tubes of their own making and have 
branchial filaments (tentacles) which 
they normally protrude from their 
tubes. A sudden decrease in illumina- 
tion elicits an immediate withdrawal 
of the filaments into the tube. Hesse 
(1899) and Bohn (1902) showed that 
repeated shadowing of Bispira voluti- 
cornis resulted in a decrement in the 
withdrawal reaction. A. W. Yerkes 
(1906) and Hargitt (1906) established 
that habituation to repeated shadow- 
ing occurs also in Hydroides dianthus. 
Yerkes found too that less habitua- 
tion occurred when the interstimulus 
interval was longer, and that no 
decrement in response occurred with 
repeated tactual stimulation. Hargitt 
reports that a swinging pendulum 
casting a shadow every 1-3 second 
soon lost its stimulative effect, but 
that this did not happen if the pendu- 
lum made only one transit per second. 
Other sedentary polychaetes in which 
habituation to sudden diminution in 
illumination has been found are 
Serpula vermicularis (Hess, 1914), 
Mercierella enigmatica (Rullier, 1948), 
and Branchiomma vesiculosum (Nicol, 
1950). Rullier also found habituation 
to mechanical shock, moving shadow, 
and a combination of the two. Nicol 
showed that the response to shadow is 
independent of habituation to light 
extinction, as evidenced by the fact 
that the former was not altered when 
preceded by the latter. 

Investigation of habituation in the 
“errant” (free-living) polychaetes 
seems to have been neglected until 
quite recently, when Clark (1960a, 
1960b) tested the reactions of Nereis 
pelagica to repeated stimulation of 
various types. These findings also 
emphasize the importance of using 
several parameters in habituation 
studies. Nereis pelagica was found to
habituate rather rapidly to mechanical
shock, moving shadow, and decreases
or increases in light intensity.
The rate of habituation to increased 
light was an inverse function of the 
interstimulus interval and an inverse 
U function of the duration of and total 
exposure to the stimulus. Persist- 
ence of the habituation was directly 
related to the interstimulus interval 
and the number of stimulations given. 
Some interesting relationships were 
found among the several stimuli: 
habituation to a moving shadow was 
independent of that to mechanical 
shock, but habituation to the shadow 
and to a decrease in illumination 
interacted in a complex fashion. 
Clark describes another type of inter- 
action effect which he labels “latent 
habituation”: repeated presentation 
of increments of light too small to 
evoke a noticeable reaction nonethe- 
less resulted in more rapid habitua- 
tion to large increments of light. 

Only one study on behavioral 
modification in the class Hirudinea 
has been located, that of Gee (1913) 
on the leech Dina microstoma. This 
animal responds to mechanical jarring 
of its bowl by swimming around and 
to shadows by contraction. Both of 
these responses habituated when the 
stimuli were repeated at 15-30 second 
intervals. The few data presented 
seem to indicate that there were some 
savings an hour later. 

Kuenzer (1958) has provided ex- 
tensive observations on the habitua- 
tion process in earthworms. Con- 
traction can be produced in these ani- 
mals by mechanical, thermal, or 
electrical stimuli. This contraction, 
Kuenzer reports, diminishes on re- 
peated presentation of the stimulus. 
Habituation via stimulation of one 
spot on a given segment extends to 
the other parts of that segment as well 
and to adjacent segments in diminishing
strength as the distance from
the stimulated segment increases.
Even within a given segment, however,
habituation to mechanical, thermal,
and electrical stimuli are all
independent of one another, except 
after long-continued stimulation (pre- 
sumably resulting in fatigue). De- 
pending on the strength of the stimu- 
lus, the response may be fully or 
only partially restored after 24 hours. 
Severing the ventral cord anteriorly 
to the clitellum (a conspicuous swell- 
ing on the body) produces enhance- 
ment of sensitivity posteriorly to the 
cut and a depression of sensitivity 
anteriorly to the cut, but severing the 
cord posteriorly to the clitellum re- 
sults in depression of response in both 
parts. 

Discussion. More is known about 
the parameters and course of habitua- 
tion in the annelids than in the flat- 
worms. The interactions of different 
stimuli have been charted to some 
extent in sedentary and errant poly- 
chaetes and in earthworms. The 
temporal relationships involved in 
habituation as well as such phe- 
nomena as “latent habituation” are 
suggestive of the similarities between 
habituation on the one hand and 
learning and extinction on the other. 
Kuenzer's (1958) work has provided 
some clues to the role of the nervous 
system in habituation. The latter 
line of investigation could be further 
developed with profit, as could the 
study of factors contributing to the 
duration of habituation. 

Classical Conditioning 

Copeland (1930) housed the poly- 
chaete Nereis virens in a water im- 
mersed glass tube, in which the worm 
remained, behaving just as though in 
a natural burrow. When food is pres- 
ent in the water, the normal undula- 
tory movements of the animal draw 
in the juices and the worm moves for- 
ward in the tube and seizes any food 
that is there. As Clark has shown 
(1960a, 1960b), a sudden change in 
illumination evokes a negative reaction
including withdrawal and cessation
of any ongoing undulatory activity.
Copeland preceded the presentation
of food for 15-20 seconds by
an increase of illumination, and found 
that after only five trials, at the rate 
of two spaced trials per day, his one 
specimen of Nereis advanced to the 
front of the tube as soon as the light 
came on. This response persisted with 
few failures for as long as the animal 
was tested (about 50 trials). The 
same worm then required only four 
trials to advance when the signal was 
changed to “light off." And finally, 
Copeland showed that the animal was 
then capable of adjusting to repeated 
reversals in the cue (light on or light 
off) in only one trial. 

Copeland and Brown (1934) em- 
ployed the same species and the same 
paradigm, but substituted touch to 
the head region for light as the CS. 
Initially the experimenters rated 
each of three worms on reaction to 
the CS, using a 10-point scale ranging 
from very low (prompt and prolonged 
withdrawal) to very high (prompt and 
prolonged advancement). For one 
worm this preliminary testing lasted 
for 10 trials, for the second 20 trials, 
and for the third 50 trials. During the 
subsequent training (two trials per 
day), the CS was followed 2 minutes 
later by the presentation of food di- 
rectly to the worm. Again, the re- 
sponse to the CS during the 2 minutes 
following its administration was rated 
on each trial. It was found that for 
each worm, the responses to the touch 
became more “positive” as training 
progressed. Whereas at the end of 
preliminary testing the average rat- 
ing was less than 4, after 10 training 
trials it was up to 6, and after 40 
training trials up to 9. There was 
only negligible decrement at the end 
of rather long periods (21-75 days for 
different worms) of discontinuation of 
training. 

Copeland’s experiment is on the 


whole more convincing, even without 
control groups, than that of Copeland 
and Brown, for at least two reasons 
(accepting for the moment as equally 

the methods of evaluating 
responses in the two experiments): 
the former involved delayed condi- 
tioning and a 15-20 second CS-UCS 
interval, whereas the latter employed 
trace conditioning and a 2-minute 
CS-UCS interval. It would seem
from studies on other animals that 
conditioning under the latter circumstances
would be somewhat difficult
to obtain, but Copeland and Brown 
found a rapid rise in the positiveness 
of response to the CS. It would be 
reassuring to see evidence (via control 
groups) in such studies that the 
contiguity of CS and UCS is a neces- 
sary factor for training to be success- 
ful. 

Such CS-UCS contiguity was found 
by Raabe (1939) to be a requirement 
for establishing conditioning in the 
aquatic Oligochaet, Lumbriculus
variegatus. This animal, a close relative
of the earthworm, burrows in
mud and often comes to rest with its 
hind end sticking out of the mud. 
Like the earthworm, Lumbriculus has 
light-sensitive cells distributed over 
its entire body. Shock, touch, and 
vibration elicit withdrawal of the 
hind end into the mud. Raabe found 
that by preceding any such UCS by 
an increment or decrement in illumi- 
nation, he could establish in some 
10-20 trials a conditioned withdrawal 
to the CS. It is significant that the 
CS-UCS interval was crucial in all 
these cases. The optimal interval be- 
tween onset of CS and onset of UCS 
was }-1 second; 3 seconds was a some- 
what less successful interval, and at 4 
seconds, or when CS and UCS were 
presented simultaneously, no condi- 
tioning could be established. This in 
itself suggests that the increase in 
response to the CS was not due to 
sensitization or pseudoconditioning, 
but Raabe made further tests anyway
which confirmed this. Another
interesting finding in this study was 
that conditioning could still be ac- 
complished, although with more diffi- 
culty, after the head and frontal 12- 
31 segments of the worm were cut off. 
In all cases, conditioning involving a 
light-on cue was more successful than 
that involving light-off. This is 
easily understandable in view of the 
animal's normal preference for dark 
and tendency to avoid light. It will 
be recalled that Soest (1937) and Dilk 
(1937) explained their differential 
training successes by the same argu- 
ment. 

Further studies on classical condi- 
tioning in Oligochaets have been 
carried out by Ratner and Miller 
(1959a, 1959b). A bright light evokes 
rearing and withdrawal of the ante- 
rior segments in the earthworm 
Lumbricus terrestris. Ratner and 
Miller (1959a) paired this UCS with 
vibration as a CS (6 seconds of CS, 
during the last 2 seconds of which the 
UCS was also presented) for 100 
trials with a 50-second intertrial 
interval. Following this, 30 extinc- 
tion trials were given. These animals 
manifested a significant increase in 
their response to the CS during train- 
ing and a significant decrease during 
extinction. Three control groups were 
exposed (respectively) to: 100 trials 
of vibration without light, 100 4- 
second observation periods without 
either shock or light, and seven se- 
quences consisting of 10 consecutive 
2-second presentations of light fol- 
lowed by 5 6-second presentations of 
vibration. In sharp contrast to the 
experimental group, each of these 
control groups showed an initial rise 
in responding, then a slow decline. 
Neither sensitization nor "sponta- 
neous” processes, then, seems an ade- 
quate explanation of the response 
increase in the experimental group. 

A second study by the same experimenters
(Ratner & Miller, 1959b)
demonstrated that conditioning in 
the same situation reached a higher 
level and declined less after a 20- 
minute rest period in a group that 
received spaced trials than in a group 
that received massed trials. After 
removal of the pharyngeal ganglia 
("brain") however, conditioning 
could be established only with massed 
trials. Sensitization by the operation 
was ruled out by appropriate control 
groups. It is noteworthy that after a 
20-minute rest period, only the nor- 
mal spaced group performed at a level 
above that at the initiation of train- 
ing. A further point to be noted in 
the Ratner and Miller studies is that 
in neither experiment was a "true" 
sensitization control group run, i.e., 
one in which CS and UCS were given 
randomly. It will be recalled that 
Baxter (1961) was able to establish a 
clear difference between the per- 
formance of such a group and that of 
a conditioning group in planaria. 
The fact, however, that spaced train- 
ing was more effective than massed 
training would seem to argue against 
a pseudoconditioning interpretation 
in the Ratner and Miller studies. 
This datum probably makes it safe to 
conclude that Ratner and Miller have 
in fact demonstrated true classical 
conditioning in the earthworm. 

Bitterman (1960) mentions that
pairing a neutral stimulus with shock 
results in a conditioned withdrawal 
reaction in the earthworm. Yet he 
has had difficulty in recording this 
CR on a kymograph, although, he 
says, the response is "clearly visible 
to the naked eye” (p. 711). 

Discussion. Classical conditioning
has been demonstrated in several 
forms of annelids. Some of these re- 
sults are quite striking, such as the 
very rapid learning of Copeland's 
worms. The studies of Raabe and of 
Ratner and Miller concur in finding 
that removal of the brain does not 
destroy the worms’ ability to form a
classically conditioned response. 
Other such manipulations are possible
and might prove most revealing.
For instance, if the pharyngeal gang- 
lia were removed after conditioning, 
would the remainder of the worm 
retain the response? Would the worm 
show any savings when the new head 
had regenerated? Another unan- 
swered question is the failure of 
Ratner and Miller's "brainless"
worms to condition with spaced
trials. It may be that the ganglia are
necessary for some functions (e.g., 
learning with spaced trials) but not 
for others (perhaps, the retention of 
such learning). 


Instrumental Learning 

The prototype of a series of maze 
studies on the earthworm was fur- 
nished by R. M. Yerkes (1912). This 
experiment, one of the most famed 
and often cited in comparative psy- 
chology, involved an N of exactly 
one. But Yerkes was able to demon- 
strate rather convincingly that this 
one Allolobophora foetida could learn to 
choose one arm of a T maze when that 
route led to a dark, earth-filled goal 
chamber and the other to electrodes 
which delivered a shock upon contact. 
When the experimenter cut off the 
first five segments of the worm, con- 
taining the pharyngeal ganglia, the 
worm continued to turn to the cor- 
rect side in the maze. It will be 
recalled that the same operation in 
Ratner and Miller’s study did not 
disrupt the truncated worm's ability 
to learn with massed trials. 

Heck (1920) successfully replicated 
the major findings of Yerkes’ study, 
using a similar apparatus and several 
varieties of earthworms (Lumbricus 
castaneus, Eisenia foetida, Allolobo-
phora caliginosa, A. longa, and A.
chlorotica). Unfortunately Heck did 
not set a criterion, but it appears that 
his worms took from 100 to 200 trials 
During training, the animal came to
withdraw upon contact with the sand-
paper. Robinson (1953) disputes the
Yerkes interpretation that this stimu- 


relatively discrete phases in the learn-
ing of a T maze by L. terrestris, repre-
sented, respectively, by y^ a signifi-
cant increase in time
occurring around Trial 50, d Occa- 
sioned by negative reactions to al 
Robinson may be reconcilable on the
basis of species differences: Yerkes
tested A. foetida, Robinson tested L.
terrestris. Schmidt found that the two
types of worms exhibited different 
learning patterns, which substanti- 
ated both of the proposed explana- 
tions, respectively. That is, while 
both species satisfied the criterion of 
learning (10 consecutive errorless 
trials) in some 70 trials, L. terrestris 
gave generalized avoidance responses 
to the whole maze, whereas the avoid-
ance responses of A. foetida were
restricted to the vicinity of the sand- 
paper. 

Bharucha-Reid (1956) found that 
earthworms (L. terrestris) confined to
a T maze for 20 hours prior to the 
introduction of shock and a goal box 
reached a learning criterion in just 
half as many trials as animals that 
had no pretraining maze experience. 
Arbit (1961) felt that Bharucha- 
Reid's control group was not the most 
appropriate one for this experiment. 
In Arbit's study, the control group 
was exposed before training to the 
apparatus for the same amount of 
time as was the experimental group, 
but was confined to the stem. The 
two groups did not differ significantly 
in number of trials to criterion on sub-
sequent training in the maze (re-
ported p=.09; additional data ob-
tained by Arbit more recently change
this p value to .15).5 What difference 
there was favored the experimental 
group, but this small discrepancy too 
can be explained in terms of greater 
habituation of the experimental group 
(which was pre-exposed to conditions 
more nearly resembling those in the 
training situation). Unfortunately 
Arbit did not run a third group com- 
parable to Bharucha-Reid's control 
group, i.e., a group which received no 
pre-experimental exposure to the 
maze. Nor can Arbit's experimental
group be considered equivalent to 
Bharucha-Reid's, since the former 
received its pre-training exposure for 
2 hours a day on 10 consecutive days, 
while the latter remained in the ap- 
paratus for 20 consecutive hours im- 
mediately preceding training. This 
may help to account for the close-to- 
significant (p=.06) difference be- 
tween these groups in number of 
trials to criterion (fewer for Bharucha-
Reid's experimental group than for
Arbit's). For further discussion of 
these latent learning experiments, see 
Bharucha-Reid (1961). 

Arbit (1957) has investigated an-
other parameter of earthworm learn-
ing: the level of activity of the animal.
L. terrestris exhibits a diurnal cycle of 
activity which reaches its highest 
point in the evening and lowest in the 
morning (Baldwin, 1917). Similarly, 
a group of L. terrestris trained in the 
evening by Arbit was found to learn 
a Yerkes-type T maze significantly 
faster than a morning group. This 
may be strictly a performance phe-
nomenon, however, since the morning
group worms also required signifi-
cantly more touches per trial to keep
them moving through the maze.

Krivanek (1956) has also suc-
ceeded in training L. terrestris to
choose the no-shock arm of a T maze. 
He established in addition another 
type of position habit which is inter- 
esting but difficult to interpret: 
worms which were repeatedly ex- 
posed to strong light from one side 
while on a glass plate were found to 
turn to the other side in a T maze in 
which no directional light was pre- 
sented. Krivanek considers both this 
and the shock-avoidance case as ex- 
amples of pseudoconditioning (as 
defined by Prosser, 1950); but in 
considering them comparable, he dis- 
regards several important operational 
differences, primarily that in the 
shock case the aversive stimulus is
contingent on the animal's behavior,
whereas in the light case this is not so.

A failure to demonstrate T maze
learning in earthworms (A. terrestris
longa, L. rubellus) led Fraser (1958) 
to question certain aspects of the 
methodology of several of the fore- 
going studies, especially (a) the fail- 
ure of the experimenter to establish 
pretraining turning preferences (and 
then to train to the nonpreferred 
side), and (b) the employment of 
weak criteria of learning. The ques- 
tion of preference establishment, 
while certainly a desirable methodo- 
logical control with any animal, was 
probably, as Fraser recognizes, not 
of any consequence in most cases;
Heck (1920), for example, trained 
worms on reversals. The second 
criticism (that weak criteria were 
used in many studies), serious though 
it may be, should be recognized as a 
problem common to the whole area of 
learning research. Indeed, the fact 
that other factors (e.g., spontaneous 
alternation tendencies) do affect per- 
formance, as well as the obvious ad- 
vantage in comparing data, often 
make the establishment of an arbi- 
trary criterion not only desirable but 
a practical necessity. There are 
several ways in which one can make 
this practice acceptable: by the es- 
tablishment of a severe criterion, by 
the use of control groups, by reversal 
training, etc. It is a felicitous state of
affairs when the results are so clear 
that no criterion is necessary (as in 
Heck's data), but unfortunately be- 
havior is seldom this uncomplicated. 

All of the studies discussed so far in 
this section have been concerned with 
teaching the worm to turn (so to 
speak) in a given direction. Presum- 
ably, the major cues in these studies 
were kinesthetic. Several studies 
have employed differential extero- 
ceptive cues in the maze training of 
annelids. Unfortunately, none of
these permits conclusions about the 
ability of the animals to learn to re- 
spond solely on the basis of the 
exteroceptive cues. 

In the first of these, Fischel (1933) 
trained Nereis virens to choose one 
side of a glass-tube Y maze, when that 
side led to darkness and the other to 
an actinia—a sea anemone which is 
apparently aversive to Nereis (the 
naturalistic approach is quite con- 
spicuous here!). In front of the "cor- 
rect" arm were small stones, and in 
front of the “incorrect” arm were fine 
pieces of shell. It is not surprising 
that when Fischel reversed positions 
of the cues after the worm was con- 
sistently choosing the correct path, 
the worm continued to turn in the 
same direction. Fischel thought that 
the cues may not have been discrim- 
inable, but it is clear that even had 
they been discriminable, the worm 
would not necessarily have used 
them as cues. It has often been found 
that an animal does not make use of 
all the available cues in a discrimina- 
tion problem. In order for a stimulus 
rather than a response habit to be 
established in the maze situation, the 
cues must be randomly assigned with 
regard to their position on a given 
trial. 
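
The requirement just stated can be made concrete with a short sketch. The Python fragment below is purely illustrative and is not drawn from any of the studies reviewed: the function name `make_schedule`, the number of trials, and the fixed seed are assumptions chosen for the example. It places the "correct" cue on the left on exactly half of the trials, in shuffled order, so that position alone predicts nothing about the outcome.

```python
import random


def make_schedule(n_trials, seed=None):
    """Return a list of (cue_side, aversive_side) pairs, one per trial.

    Hypothetical helper: the side carrying the "correct" cue is drawn from
    a balanced, shuffled list, so a fixed turning habit cannot produce
    consistently correct choices.
    """
    if n_trials % 2:
        raise ValueError("use an even number of trials for a balanced schedule")
    rng = random.Random(seed)
    sides = ["left", "right"] * (n_trials // 2)
    rng.shuffle(sides)
    # The aversive arm is always the one opposite the correct cue.
    return [(s, "right" if s == "left" else "left") for s in sides]


schedule = make_schedule(20, seed=1)
left_count = sum(1 for cue_side, _ in schedule if cue_side == "left")
print(left_count, "of", len(schedule), "trials have the correct cue on the left")
```

With a schedule of this kind, an animal that merely repeats one turn is correct on only half of the trials, so above-chance performance must rest on the cue itself.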

Wherry and Sanders (1941) fell into 
the same error in training L. terrestris 
to select the lighted rather than the 
dark side of a T maze. Worms were 
shocked if they advanced a certain 
distance into the darkened arm. 
Training was successful, but a subse- 
quent position reversal of the cues 
produced a temporary disruption of 
performance. It may be that both 
light and position cues were utilized 
by the worms, for the reversal was 
learned very rapidly after the initial 
disruption. 

Datta (in press) taught earthworms 
a position habit in a T maze, using 
shock as punishment and return to
the home container as reward. She 
reports that tactual differentiation of 
the floors of the maze-arms produced 
no better performance than occurred 
in the undifferentiated maze. Ap- 
parently, however, she did not at- 
tempt to ascertain whether the worms 
could learn to respond differentially 
to the tactual cues alone. The major 
concerns in Datta's research were 
distribution of practice, retention, 
habit reversal, and a comparison of 
two different indices of learning. The 
most important findings may be 
summarized as follows: (a) A 1- 
minute intertrial interval resulted in 
the same degree of learning as did a 5- 
minute interval, but with intertrial 
intervals of 25 or 125 minutes no 
learning was manifested when the in- 
dex of learning was the proportion of 
correct initial choices. When, how-
ever, the index was frequency of
repetitive errors, all groups showed 
about the same rate of improvement. 
(b) On both indices, some savings 
were found after 5 days of rest but 
no savings after 15 days of rest. 
(c) Progressive improvement oc- 
curred on successive position-rever- 
sals, but Datta reports that this im- 
provement seems to develop as a 
function of maze-training per se, 
rather than being dependent on ex- 
perience with reversal. (d) The two 
indices of learning employed (proba- 
bility of correct initial choice and 
change in the frequency of repetitive 
error) showed several interesting dif- 
ferences, which led Datta to con- 
clude that these two measures 
“might reflect the operation of differ- 
ent learning processes." 

Two further studies on annelids 
are not clearly cases of learning, but 
may be considered simple examples 
of “trial-and-error” behavior. Tey-
rovsky (1922) placed barriers in dif-
ferent positions around the tube
worm Spirographis spallanzani, and
found that after an initial period of 
"testing" the environment with its 
tentacles, the animal finally came to 
rest in a position that permitted max- 
imal extension of the tentacles. This 
behavior is clearly adaptive (in 
effect) in that it permits the most 
efficient food gathering. 

Málek (1927) observed that an 
earthworm (L. herculeus) finally
abandoned the attempt to pull a fixed 
leaf into its burrow, after about a 
dozen attacks on the leaf. This may 
be similar to the extinction of an in- 
strumental response. 

Discussion. The body of work on 
annelid maze learning clearly sug- 
gests not only that earthworms can 
learn an instrumental response but 
that they exhibit other behaviors 
(spontaneous alternation, latent 
learning) once thought restricted to 
mammals. It need hardly be sug- 
gested that other such capacities 
should be sought. A desirable pro- 
cedure in future maze studies might 
be to use exteroceptive cues to indi- 
cate the correct path. Illumination 
and floor cues are the most likely 
candidates. The variation in position 
of the cues (and hence of the correct 
turn) might circumvent the problem 
of spontaneous alternation. Further 
information should be obtained too 
on the role of the nervous system in 
maze learning. Yerkes (1912) re- 
ported that as the new brain grew 
back in his decapitated worm, the 
worm progressively “forgot” the 
maze. Yet his data are inconclusive 
on this point, and no further tests of 
this have been conducted. If Yerkes’ 
conclusion is found to be true, what 
is the nature of this forgetting? 
Cutting off the new head and testing 
the worm could help to determine 
whether the entire nervous system 
had forgotten the habit or whether 
the new tissue had simply usurped 
control of the animal's behavior from
the other nervous elements.

Variability

Caldwell and Kailan (1955) set out 
to observe the effects of different in- 
tensities of light on the maze running 
of earthworms. A group subjected to
an illumination intensity of 10 foot- 
candles in the maze proper and of 2 
foot-candles (as well as to moss and 
dirt) in the goal box was found to 
reach the goal faster and more fre- 
quently in 10 trials than did a group 
for which both maze and goal box 
were illuminated at an intensity of 5 
foot-candles, and for which there was 
no moss or dirt in the goal box. When 
the light intensities for both groups 
were then changed to 110 foot-can- 
dles in the maze and 5 foot-candles 
in the goal box (plus moss and dirt 
for both in the goal box), both groups 
performed at the previous level of the 
10:2 group in both time and fre-
quency of success. Caldwell and
Kailan (1955) conclude that “the 
introduction of intense light in the 
maze, also moss, earth, and reduction 
of light in the goal box, served as 
motivation and reward for these 
organisms” (p. 143). It appears, how- 
ever, that these several variables 
were manipulated in far too con- 
founded a fashion for any conclusions 
to be permissible as to what was pro- 
ducing the observed differences in 
performance. 

Several studies have investigated 
spontaneous alternation phenomena 
in Lumbricus terrestris. Wayner and 
Zellner (1958) found that practically 
all (24 out of 29) of their animals 
alternated turns in a T maze signifi- 
cantly more than chance expectation. 
Removal of the pharyngeal ganglia, 
however, resulted in significantly 
less-than-chance alternation in most 
worms, and a decrement in amount of 
alternation for all. No decrement 


two effects correlated .85. A final 
control showed that alternation oc- 
curred in only two out of 29 animals 
when the intertrial interval was 24 
hours. Kasper (1961) too found 
greater-than-chance alternation both 
in a T maze and in a maze consisting 
of a forced turn followed by a 
"choice" turn. The latter result is 
particularly interesting, even though 
only one worm was tested, in that 
rats have been reported not to alter- 
nate (from forced turn to choice 
turn) in an apparatus of the same 
pattern (Estes & Schoeffler, 1955).
Arbit and McLean (1959) had no 
such success in their test of alterna- 
tion in Lumbricus. Worms were 
allowed to choose any of four turns 
in a fan-shaped maze (the turns: 
90-degree left, 30-degree left, 30- 
degree right, 90-degree right). Prior 
to this testing, the worms were 
trained equally on each of the four 
paths and then, just before the critical 
choice, were given 10 forced trials on 
either the 90-degree left or the 90- 
degree right turn. Arbit and McLean 
report that tendencies to alternate on 
the critical trial were not evidenced. 
In attempting to account for the dis- 
crepancy between these results and 
those of Wayner and Zellner (1958), 
one should note the rather substantial 
procedural differences. In particular,
Arbit and McLean's use of the four-
choice situation and of pretest train-
ing deviate from the usual concep-
tion of a spontaneous alternation
test.
Fraser (1958) found, interestingly 
enough, that while the earthworm A.
terrestris longa alternated turns in a T
maze significantly more often than
the chance expectation, the earth-
worm L. rubellus alternated signifi-
cantly less often than chance. Fraser
notes that differential tendencies of 
this sort may help to account for dis- 
crepancies in learning experiments in 
earthworms. 
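
Statements that a worm alternated more or less often than chance presuppose a test against some chance level. A minimal sketch of one such test follows, assuming a chance probability of .5 and independent choice opportunities; neither assumption is taken from the studies reviewed, and the counts are hypothetical. The function names are likewise invented for the illustration.

```python
from math import comb


def binomial_two_sided_p(k, n, p=0.5):
    """Exact two-sided binomial p value for k alternations in n opportunities."""
    pmf = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    cutoff = pmf[k] * (1 + 1e-9)  # small tolerance for floating-point ties
    # Sum every outcome that is no more probable than the observed one.
    return min(1.0, sum(x for x in pmf if x <= cutoff))


def direction(k, n, p=0.5):
    """Report whether the observed count lies above, below, or at chance."""
    expected = n * p
    return "above" if k > expected else "below" if k < expected else "at"


# Hypothetical counts: one worm alternating on 16 of 20 opportunities,
# another on only 4 of 20.
p_high = binomial_two_sided_p(16, 20)
p_low = binomial_two_sided_p(4, 20)
print(direction(16, 20), round(p_high, 4))
print(direction(4, 20), round(p_low, 4))
```

The two hypothetical animals depart from chance by the same amount in opposite directions, which is the pattern Fraser describes for the two species; the test itself cannot say which source of alternation is responsible.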

Discussion. As was seen to be the 
case for planaria, tests of alternation 
in earthworms have not attempted to 
separate the several sources con- 
tributing to alternation. In the re- 
search thus far, response factors 
would appear to have influenced the 
results most strongly. It would be 
interesting to ascertain whether 
earthworms would alternate with re- 
spect to exteroceptive stimuli alone. 
Previous work clearly suggests that 
a simple T maze is the most appro- 
priate instrument for such tests. The 
question of species differences in 
alternation tendencies also deserves 
further research. 


CONCLUSIONS 


This review has been quite atten- 
tive to problems of methodology, 
since earlier work was often so ill- 
controlled and since the area of learn- 
ing in lower organisms is even now 
beset in other quarters by methodo- 
logical difficulties (Jensen, 1957). It 
is apparent that insofar as the phyla 
Platyhelminthes and Annelida are 
concerned, questions of method and 
control need no longer deter research. 
The convincing demonstration of 
learning capacities in these physio- 
logically-interesting animals has 
cleared the way for the investigation 
of a number of problems. 

Among the most intriguing of these 
problems is that of the limits of the
behavioral complexity of worms. The re-
sults so far have paralleled to a sur-
prising degree the phenomena found
in “higher” animals, and one may 
venture to predict that exploitation
of this question will be limited in the
future as in the past more by the 
imagination and prejudices of the 
experimenters than by the capacities 
of the animals. Only recently has it 
been generally acknowledged that 
mammals other than primates are 
more than drive-reducing machines. 
Perhaps the recognition of compar- 
able talents in lower organisms will 
not be so slow in coming. 

A promising start has been made in 
studying the physiological substrate 
of learning in flatworms and annelids. 
Enough has been accomplished in this 
regard to make it clear that future 
investigations are likely to provide a 
major break-through in this crucial 
puzzle. Particularly encouraging is 
the growing interest of zoologists and 
biochemists in worm learning and the 
resulting trend toward an interdis- 
ciplinary approach to the problem. 
It is likewise this writer's conviction 
that psychologists who espouse the 
comparative approach will recognize 
that the study of lower organisms 
may yield many important contribu- 
tions to our knowledge of learning. 

ERRATA


In the article "Familial Concordance by Sex with Respect to Schizo-
phrenia" by D. Rosenthal (Psychol. Bull., 1962, 59, 401-421) the column 
headings Male and Female should be reversed in Table 1, page 403. 


In the article entitled “Color Vision Research and the Trichromatic 
Theory: A Historical Review" by S. Balaraman (Psychol. Bull., 1962, 59, 
434—448) line 21, page 435 should read: 


Theory and research are reviewed and integrated with sociological data
on mate selection and marital happiness. Topics included are homog-
amy, interpersonal perception, identifications, complementary needs,
and role theory. Modal marriage-role expectations exist; they are dif-
ferent for the sexes, and are established by parental identifications. The
husband role is more instrumental; the wife role more integrative. For
the marital happiness of both spouses, the husband's role performance is 
more crucial than that of the wife. For all marriages—both modal and 
unique—happiness is a function of the satisfaction of those needs and 
expectations which, for each individual, are specific to marriage. Thus, 
investigation of role-specific need dispositions is preferable to conven- 
tional approaches which deal with general personality needs. 


After 70 years of research, the 
broad outlines of a systematic social 
science approach to marriage may be 
discerned. Both psychology and 
sociology have made extensive ex- 
plorations. Before us now is the task 
of integration, which should map the 
work of both disciplines into ap- 
propriate relations with one another. 
Like early cartographers, we shall 
err; but a solid ground of data exists, 
and can be distinguished from un- 
known seas. 

Marriage research began in the 
1890s with Pearson's comparisons of 
the anthropometric characteristics of 
spouses. From that time until our 
own, the organizing issue in all mating 
research has remained the same, 
namely, the degree of similarity be- 
tween husbands and wives. That is, 
do “likes marry likes” (homogamy),
or do "unlikes" marry (heterogamy)?

¹ The major portion of this report was
written at the University of Michigan and
supported by a United States Public Health
Service fellowship.
to E. Lowell Kelly for his direction and en-
couragement.

Sociology has produced convincing 
evidence for homogamy of several 
cultural variables. Hollingshead
(1950) has provided both an excellent 
bibliography and a definitive piece of 
research demonstrating homogamy 
with respect to race, age, religion, 
ethnic origin, and social class. More 
recently, residential propinquity has 
been added to sociological variables 
influencing mate selection; Katz and 
Hill (1958) provide a bibliography, a 
review, and an integration. These 
factors, then, largely define that pool 
of opposite-sex individuals which one 
is most likely to meet and know; it 
may be called the "field of eligibles” 
(Winch, 1952). 

Yet it is obvious that the individual 
psychology must be accountable to 
some degree for the “field of ac-
quaintances" from which the mate
must ultimately be selected. Psy-
chological factors must affect the
limits of the field, and most certainly 
selection from within those limits. It 
is perhaps such considerations that 
have led sociologists to extend their
investigations to psychological factors 
affecting mate selection and marriage 
outcome. 

Such investigations began in the 
1920s. The pioneering work of 
Burgess and Cottrell, King, Locke, 
Terman, Kirkpatrick, and others has 
been presented and summarized in 
Burgess and Wallin’s important book, 
Engagement and Marriage (1953), 
which also reports the results of their 
own study of 1,000 engaged and 666 
married couples. 

In all these early studies, homog- 
amy—not heterogamy—is the trend, 
though relationships are of a low 
order among psychological variables 
—much lower than for the investi- 
gated cultural characteristics and 
social traits. For example, Burgess 
and Wallin reported that of the 42 
items of the Thurstone Neurotic 
Inventory, 14 showed a greater than 
chance expectation for homogamy of 
engaged couples. None were heterog- 
amous. The significant relationships 
ranged (in ratio of obtained to ex- 
pected similarity) from 1.17 on “do 
you day-dream frequently?” to 1.04 
on "when you were in school did you 
hesitate to volunteer in a class recita- 
tion?" Comparable results are re- 
ported for items on the Bernreuter 
Personality Inventory and the Strong 
Interest Test by Terman (1938). 
Homogamy, then, obtains in assorta- 
tive mating. 
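
The ratio of obtained to expected similarity quoted above can be illustrated with a short sketch. It is offered only as an illustration: the source does not say how the expected similarity was derived, so the code assumes it is the agreement that would occur if partners were paired at random, given the answer frequencies of each sex. The responses and the function name `similarity_ratio` are invented.

```python
from collections import Counter


def similarity_ratio(pairs):
    """Return obtained / expected agreement for a list of (man, woman) answers.

    Assumption: expected agreement is computed from the marginal answer
    frequencies of men and women, i.e. as if couples were formed at random.
    """
    n = len(pairs)
    obtained = sum(1 for man, woman in pairs if man == woman) / n
    men = Counter(man for man, _ in pairs)
    women = Counter(woman for _, woman in pairs)
    expected = sum((men[a] / n) * (women[a] / n) for a in men)
    return obtained / expected


# Hypothetical answers of ten engaged couples to a single yes/no item.
pairs = (
    [("yes", "yes")] * 4
    + [("yes", "no")]
    + [("no", "yes")]
    + [("no", "no")] * 4
)
ratio = similarity_ratio(pairs)
print(round(ratio, 2))
```

A ratio above 1.00 indicates homogamy on the item and a ratio below 1.00 heterogamy; the reported values of 1.04 to 1.17 therefore represent only a slight excess of agreement over random pairing.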

Marital success was at that time 
(and remains) the second outcome 
variable of interest to researchers.
Generalizing from the studies of 
Terman (1938), Terman and Oden 
(1947), and Burgess and Wallin 
(1953, p. 529), the latter authors 
present the following lists of char- 
acteristics as the most decisive in 
differentiating happy from unhappy 
marriages: 


HAPPILY MARRIED           UNHAPPILY MARRIED

Emotionally stable        Emotionally unstable
Considerate of others     Critical of others
Yielding                  Dominating
Companionable             Isolated
Self-confident            Lacking self-confidence
Emotionally dependent     Emotionally self-sufficient

Employing the Thurstone items 
(obtained before marriage) weighted 
for maximum discrimination, Burgess 
and Wallin report correlations with 
marital success scores of .25 for men 
and .18 for women. Bernreuter re- 
sponses, after marriage, provide suc- 
cess correlations of .38 and .42 for the 
sexes, respectively, according to Ter-
man. The Burgess and Wallin results
were substantially replicated more 
than 20 years later on a grossly differ- 
ent sample by Burchinal, Hawkes, 
and Gardner (1957). That indi- 
viduals' neurotic traits are predictive 
of marital disharmony can be ac- 
cepted as a demonstrated fact. 

The generalization "homogamy- 
with-respect-to-personality-traits" is 
drawn by all the classic investigators. 
It should be remembered, however, 
that most traits investigated are 
neurotic in character. That neurotics 
unite in marriage with neurotics is an 
observation common in psychoana- 
lytic literature. In the light of our 
present knowledge of the relation- 
ships between culture and personal- 
ity, homogamy of the degree reported 
with respect to social interests and 
general personality traits could likely 
be accounted for on the basis of the 
common modal personalities of in- 
dividuals in common cultural groups; 
particularly when it is known that 
these cultural similarities establish 
the marital field of eligibles. The 
effect of degrees of homogamy or 
heterogamy on marital success has 
not been assessed, beyond the fact
that individuals possessing those 
traits listed in the right-hand column 
above are more likely to be unhap- 
pily married, and are likely to be 
married homogamously, and are 
thereby doubly damned. 

The main body of this review, then, 
is concerned with studies having the
above-summarized information
as background. More recent research
can conveniently be divided into four
somewhat overlapping areas: inter-
personal perception, identification,
complementary needs, and role 
theory. 


INTERPERSONAL PERCEPTION 


Perception of the self and of others 
has lately been a central construct in 
influential theories and research of 
personality and personality change 
(Rogers & Dymond, 1954). Although 
the classic studies discussed above 
have used self-ratings and ratings-by-
others as techniques in marriage re-
search, Kelly (1941) was the first to
consider perception of personality as 
an operative force in its own right: 
"the actual relative position of the 
husband and wife on a personality 
trait continuum are not as important 
in determining compatability as the 
belief of the husband and wife regard- 
ing their relative positions on these 
scales" (p. 193). The instrument 
used to investigate this proposition 
was Kelly's 36-item Personality Rat- 
ing Scale, administered for self- 
perception and perception of spouse 
to 76 couples. His results may be 
summarized as follows: subjects rate 
themselves less favorably than they 
rate their spouses, and less favorably 
than they are rated by their spouses. 
The Burgess-Terman-Miles Compati- 
bility Index was also administered to 
each subject, yielding the following 
information: high compatibility is
associated with more favorable self-
ratings, but accompanied by spouse
ratings which are yet more favorable.
These findings hold true for both
husband and wife. Kelly concludes 
that an individual's personal satis- 
faction in marriage is related both to 
self-regard and to the judgment of 
the self's inferiority or superiority 
vis-a-vis the spouse. 

Preston, Peltz, Mudd, and Froscher 
(1952) extended this type of investi- 
gation to the consideration of the 
relationship between person-percep- 
tion and an objective appraisal of that 
person. Couples drawn from the 
clients of the Marriage Council of 
Philadelphia constituted the sample. 
Fifty-five couples had received pre- 
marital counseling; 116 had received 
postmarital counseling. The two 
groups can be accepted as more- and 
less-happily married subjects. Using 
a personality rating scale of 17 items 
—selected from those used by Kelly 
(1941) and Burgess and Cottrell 
(1939) — Kelly's results were substan- 
tially verified, except that the less- 
happily married men judged their 
wives much more severely than them- 
selves. This discrepancy seems in 
principle to conform with Kelly's 
formulation. The difference in range 
of happiness to be expected between 
the samples of the two studies would 
seem to account for this disparate 
finding. 


Further results are as follows: 


1. Self-ratings of spouses show 
positive correlations of the same 
order as those of the classic studies 
with a tendency for greater con- 
gruence in happier than in unhappy 
couples. (Medians=.19 and .30, re- 
spectively.) 

2. Higher correlations occur, how- 
ever, between ratings-of-self and 
ratings-of-spouse. This tendency is 
likewise stronger with more happily-
marrieds.

Concerning the question of objec- 
tivity of perception, Preston et al. 
(1952) comment as follows: 


The correlations between the self-ratings of 
the spouses are uniformly much less than the 
correlations between the ratings of self and 
partner no matter which spouse is studied. 
Further, the data of the experiment indicate 
conclusively that the happily married group 
exhibit a larger discrepancy between the rele-
vant correlation coefficients. From these two
facts the conclusion is inescapable that the 
happily married groups show more evidence 
of lack of realism in their personality ap-
praisals than the unhappily married group
p. X 


This conclusion becomes quite 
escapable when one realizes that the 
self as seen by the self, and the self as 
seen by the spouse, necessarily con- 
stitute different stimulus patterns; 
there is no reason to expect total 
agreement. Further, it is somewhat 
risky to invoke “realism” as a con- 
sideration when none of the variables 
concerned are externally validated. 
Rather, these data indicate a per- 
ceived similarity of self and spouse as 
they interact, such similarity increas- 
ing with marital happiness. 

Dymond’s (1954) data seem to sup- 
port this view. She concludes: “Mar- 
ried love is not blind... the better 
each partner understands the other's 
perceptions of himself and his world, 
the more satisfactory the relation- 
ship" (p. 171). Her subjects were 15 
couples well known to her, with a 
mean length of marriage of 10.4 years. 
One hundred MMPI items, pertain- 
ing to interaction with others, were 
administered to each of the 30 sub- 
jects. After answering for the self, 
each subject predicted the spouse's 
answers. In order to control for 
stereotypy of reply, all items which 
were answered uniformly by more 
than two-thirds of the group were 


ROLAND G. THARP 


eliminated, leaving 55 items exhibit-
ing a reasonable degree of difference.
Since the yes-or-no probabilities of
these items were roughly equal,
predictive ability ("understanding")
would be uncontaminated by knowl- 
edge of group norms. Scores were 
then related to the happiness of the 
marriage, as rated by the subjects 
themselves and validated by 
Dymond's rating. The usual finding 
occurred: happily married spouses 
resembled each other more than un- 
happily marrieds. Dymond's prin- 
cipal hypothesis was verified also; 
happys predict spouse replies signifi- 
cantly better than do unhappys. 
Further, there is significantly less 
association between similarity of self- 
spouse and accuracy of prediction in 
the happy than in the unhappy group. 
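
The screening step described above, dropping items that more than two-thirds of the group answered the same way and then scoring predictions on what remains, can be sketched briefly. The Python below is only an illustration under stated assumptions: the function names and the toy responses are hypothetical, answers are stored as booleans (True for "yes"), and "more than two-thirds" is read as a strict inequality. It is not Dymond's own scoring procedure.

```python
def screen_items(responses, num=2, den=3):
    """Return indices of items NOT answered uniformly by more than num/den of the group.

    responses: one list of booleans per subject (hypothetical layout).
    Integer arithmetic avoids floating-point trouble at exactly two-thirds.
    """
    n_subjects = len(responses)
    kept = []
    for item in range(len(responses[0])):
        yes = sum(1 for subject in responses if subject[item])
        majority = max(yes, n_subjects - yes)
        if majority * den <= num * n_subjects:
            kept.append(item)
    return kept


def prediction_accuracy(predicted, actual, items):
    """Share of retained items on which a prediction matches the spouse's own answer."""
    hits = sum(1 for i in items if predicted[i] == actual[i])
    return hits / len(items)


# Toy data: three subjects, three items (illustrative only).
toy = [
    [True, True, False],
    [True, True, True],
    [True, False, False],
]
kept = screen_items(toy)
accuracy = prediction_accuracy([True, True, False], [True, False, False], kept)
```

With 30 subjects, as in the study described, this rule would drop any item on which more than 20 subjects gave the same answer.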

It can be seen from the foregoing 
studies that with increases in self- 
similarity, increases of perceived self- 
similarity and increases in predictive 
ability, happiness is greater. But all 
research indicates that, presumably 
due to patterns of assortative mating, 
the two selves of the partners— 
happy or no—exhibit similarity. The 
inference seems, therefore, that hap- 
piness increases as does congruence 
between self-as-self and self-as-spouse. 
Put differently, when the self as seen 
by the self and the self as seen by the 
spouse become more nearly equal 
stimulus configurations; that is, when 
the self, acting as spouse, does no 
violence to self-identity, then, either 
causatively or concomitantly, happi- 
ness increases. Considerations such as 
these will be expounded more fully 
under the section on role theory 
below. 

Corsini's (1956a, 1956b) important 
and startling results allow further 
generalizations. Twenty volunteer 
students and their spouses, from the 
University of Chicago, participated. 
Marital happiness was assessed by 
the Burgess-Wallin scale. A 50-item 
adjective Q sort was sorted four times 
by each subject: (a) for self, (b) for 
spouse, (c) prediction for spouse, and 
—adding a new dimension to previ- 
ous research—(d) prediction of the 
spouse's description of the subject. 
A long-overdue experimental control 
was instituted by Corsini: every con- 
clusion with respect to couples was 
checked by drawing random samples 
of noncouples, and the same opera- 
tions for couples duplicated. Follow- 
ing previous investigators, Corsini 
agrees that: (a) understanding the 
mate is not related to similarity of 
self and mate, and (b) happiness is 
associated with similarity of self- 
perceptions. 

However, Corsini (1956a, 1956b) 
discovered that although understand- 
ing can be shown to exist between 
husbands and wives, this under- 
standing is related to marital happi- 
ness only in those comparisons when 
the husband is the target of Q sorts 
(that is, wife's prediction × husband's 
self-perception; and husband's pre- 
diction × wife's perception of him). 
In these instances, husband-wife 
correlations vary positively with 
marital happiness for both mates. 
This strongly suggests that the hus- 
band's role in marriage is the crucial 
one for the satisfaction of both 
partners. However, the above-stated 
relationship was then shown by Cor- 
sini to be no more true for husband 
and wife than for randomly-paired 
men and women who did not even 
know each other! This led him to sug- 
gest that the relevant relationship 
may exist between marital happiness 
and a stereotyped conception of the 
husband. He then demonstrated 
that the greater "conformity" of 
male self-perception (measured by 
the mean correlation for each male 
against all other males) is 
correlated with happiness for both 
husband and wife. None of these 
relationships holds when perceptions 
of the female is the variable con- 
sidered. 

It seems, therefore, that our prior 
generalization can be expanded. The 
congruence, necessary for happiness, 
between self-perception and percep- 
tion by the spouse is particularly 
crucial for the male; further, this 
agreement as to male-as-husband 
most often partakes liberally of 
widely-shared expectations of hus- 
bandly behavior. 

Luckey (1959, 1960a, 1960b), in 
her careful and impressive study, 
contributes to this emerging formula- 
tion. Eighty-one couples, all of some 
education at the University of Minne- 
sota, were selected from a much 
larger subject-pool in order to pro- 
vide two groups highly differentiated 
on the Locke and Terman marital 
happiness scales. The Leary Inter- 
personal Check List (ICL) was com- 
pleted by each subject for self, spouse, 
ideal self, mother, and father. Con- 
gruence or divergence between a 
respondent and these “significant 
others" could be estimated on each of 
four scales provided by the ICL. 
Luckey's results support Corsini's. 
Satisfaction in marriage is related to 
the congruence of the husband's self- 
concept and that held of him by the 
wife. The relation does not hold for 
concepts of wives. Happiness is also 
related to (a) congruence of the hus- 
band's self and ideal concepts, (b) 
congruence of husband's self-concept 
and his concept of his father, and (c) 
congruence of the wives' concepts of 
their husbands and concepts of their 
fathers. 

It seems, therefore, that the max- 
imally happy marital situation can 
be described as follows: husband and 
wife agree that he is as he wishes to 
be, namely, like his father; and as she 
wishes him to be, namely, like hers. 
Surely this broad area of agreement 
is the culturally defined male sex-role 
—more specifically, the male subrole 
of husband. 


IDENTIFICATION 


The mechanism whereby appropri- 
ate sex-typical behaviors are trans- 
mitted from one generation to an- 
other has long been labeled "identifi- 
cation." The psychoanalytic account 
of the process involved is the most 
elaborate: the boy renounces a direct 
libidinal claim upon the mother in 
favor of vicarious gratification 
through the father, with whom he 
thus "identifies"; thereby establish- 
ing congruent values and behaviors 
between boy and specific father, and 
also between the boy and the general 
male gender. The process for the girl 
is held to be similar, though more 
gradual, and culminating not in a 
preschool climax, but in a diffused 
struggling until late adolescence or 
early marriage, when the female 
identity crisis must be met. 

In any case, the child renounces 
strong libidinal cathexes upon the 
opposite-sex parent. The obvious 
inference for mate selection has been 
repeatedly drawn: the courtship quest 
is for the opposite-sex parent image 
(Dreikurs, 1930; Fluegel, 1926; Ham- 
ilton & McGowan, 1930). Sporadic 
and generally unsuccessful efforts to 
test this hypothesis have been made. 
Hamilton and McGowan (1930) re- 
ported that only 17% of men studied 
did marry women bearing physical 
resemblances to their mothers. Of 
these men, however, 94% were 
happy, whereas only 33% of the men 
were happy when mates did not 
resemble mothers. A similar, though 
only slight, relationship held between 
happiness and wife's similarity to 
mother's temperament. 

If men marry mother's images, 
would not sons of younger mothers 
marry younger women than sons of 
older mothers? Commins (1932), 
using 1,075 subjects of the English 
Who's Who, reported statistically 
significant younger age at marriage 
for oldest sons as compared to other- 
than-oldest sons. Kirkpatrick (1937), 
using 768 cases from the Compendium 
of American Genealogy, found no rela- 
tionship between sibling position and 
mean age at marriage. Mangus 
(1936), using 600 college women as 
subjects, found that, on matters of 
interests and personality traits, wom- 
en rate their ideal-as-husband more 
similarly to their current most inti- 
mate male companion than to their 
fathers. We may conclude, with 
Sears (1942), that there are as yet no 
statistical investigations which are 
adequate for purposes of verifying the 
mate-opposite sex parent resem- 
blances notion. 

The more recent investigation by 
Strauss (1946a, 1946b), though, did 
give new life to the issue. A group of 
373 engaged, informally engaged, or 
recently married persons (200 women, 
173 men) participated. Strauss re- 
ports greater resemblances between 
men’s mothers and mates than be- 
tween women’s fathers and mates, 
but this information was garnered by 
simply asking the subjects how much 
resemblance existed—"very much" 
to "not at all." Responses less sub- 
ject to bias, fortunately, were ob- 
tained on 25 personality traits, rated 
by each subject separately for self, 
mate, and parents. These data give 
evidence for something more than 
chance congruence between personali- 
ties of mate and parent, but not 
necessarily of the opposite sex parent. 
On the basis of interviews conducted 
with some female subjects, Strauss 
suggests that childhood affectional 
experiences with parents are linked 
with adult love choices. 

The precise nature of this link 
seems to be the processes of identifi- 
cation. The Burgess-Wallin (1953) 
data, studied by Lu (1952c), indicate 
that parental authority-domination, 
as reported by the offspring, is posi- 
tively related to childhood conflict 
with the parent, and negatively re- 
lated to adult attachment to that 
parent, irrespective of the sex of 
parent or child. These conclusions 
were based on the several items in the 
Burgess-Wallin questionnaire which 
bore face validity to the dimensions 
investigated. This limitation, plus 
the evident opportunity for the sub- 
jects to respond with halo, would lead 
a reader to withhold judgment on 
Lu's hypotheses. However, precisely 
this relationship is being demon- 
strated in current developmental- 
longitudinal studies of identification 
processes (Kagan, 1958; Payne & 
Mussen, 1956). Apparently it is 
affectional bonds which lead the boy 
to identify with the father, not fear of 
his castrating ire. 

Earlier, we proposed that identifi- 
cation with the father leads to hap- 
pier marriage. Assuming that early 
affectional relationships with the 
father lead to stronger identification, 
we would expect that such an affec- 
tional relationship would affect mari- 
tal happiness. 

Luckey's results are pertinent here 
(Luckey, 1960a, 1960b). In the un- 
happy marriages which she studied, 
men saw their fathers as more domi- 
nant and less loving than themselves 
on each of the ICL scales. Lu's fur- 
ther work suggests one of the conse- 
quences of conflict with parents (Lu, 
1952b). By the use of a 16-item 
dominance-submission scale, he di- 
vided marriages into husband-domi- 
nant, equalitarian, and wife-dominant 
groups. Dominant roles are associ- 
ated with conflict with parents, 
equalitarian roles associated with 
affectional attachment to parents. 
Further, there is good evidence for 
equalitarian roles’ positive associa- 
tion with marital adjustment, and 
dominant roles’, by either spouse, 
negative association with marital 
adjustment (Lu, 1952a). 

Obviously, so few studies have been 
done in this area that only the most 
tentative general hypotheses can be 
extracted. But it does not seem 
untoward to propose the following. 
Solid affectional father-son bonds 
lead to the adoption, by the youth, 
of the ways of the male. This allows 
him, as husband, to be thoroughly 
himself while enacting the expected 
male role as husband. This satisfac- 
tory performance of husband role 
satisfies the expectations of the wife; 
the husband too is happy, for the 
self-as-self and the self-as-spouse 
produce no conflict. Under such 
circumstances, no submissive or com- 
pensating-dominating patterns of re- 
lationship need be instituted. The 
possible permutations of the few 
variables used here lead to a myriad 
of predictions, none of which could be 
checked by data now available. 
(Though it should be mentioned that 
a pattern of early affectional relation- 
ships leading to a predominantly 
cross-sexual identification—at least 
in the male—would be expected to 
lead to results opposite to those out- 
lined above.) 


COMPLEMENTARY NEEDS 


A new and vigorous dissident 
entered the homogamy-heterogamy 
issue in the person of R. F. Winch, 
who with his associates has elaborated 
the theory of complementary needs. 
Briefly stated, the theory holds that 
though homogamy of social char- 
acteristics establishes a "field of 
eligibles," mate selection within this 
field is determined by a specific kind 
of heterogamy of motives—comple- 
mentarity. This complementarity 
may be of two kinds: (a) that in which 
partners differ in degree of the same 
need, or (b) differ in kind of need. 
That mate is selected who offers the 
greatest probability of providing 
maximum need satisfaction, as the 
partners act according to their com- 
plementary pattern of motives: 


So that if individuals A and B have comple- 
mentary need patterns, B's resulting behavior 
will be a greater source of gratification to A 
than will be the case with the behavior of C, 
who is psychically similar to A (Winch, 
Ktsanes, & Ktsanes, 1954, p. 242). 


In marriage research, no other 
hypothesis produced in the last dec- 
ade has been as influential: 


(Winch's) work represents a valuable entree 
to an extremely complex and subtle problem 
area ...not only to family studies, but to 
many other problem areas as well, notably 
personality types and the division of labor, 
cohesion in small groups, stable marginal 
adjustments, etc. (Rosow, 1957, p. 232). 


"It is through this fulfillment-of- 
complementary needs approach that 
further sociological studies should 
bear fruit" (Kephart, 1957). Appli- 
cation of the CN approach is being 
made to the field of marital counsel- 
ing and social work (Meyer, 1957; 
Winch, Martha, 1958). The Winch 
group has amassed some 11 separate 
publications treating complementar- 
ity; several dissertations; engendered 
four critical articles; and numerous 
derivative studies, at least four of 
which have been published. 

Yet no thorough appraisal of the 
data on which the theory of CN is 
based has appeared.* That will be the 
next task of this review. Now let us 
examine Winch's procedures. 

The sample is described as: 
25 married undergraduate students in selected 
schools of Northwestern University of white 
race, middle class background, 19-26 years of 
age, second or later generation native-born of 
Christian or no specified religion; and their 
spouses (Winch et al., 1954). 


Twelve “needs” from Murray's well- 
known list, as well as three "general 
traits" were studied. Most of these 
variables were "double-dichoto- 
mized," that is, rated separately for 
being operative within or without the 
marriage, and separately for operat- 
ing overtly or covertly, yielding 44 
subvariables. 

Three techniques were employed to 
garner information from which the 44 
subvariables could be quantified. 
First came the need interview, a 
structured interview from which the 
following are the published sample 
questions: “how do you feel when 
someone steps in front of you in a 
queue in a crowded restaurant”; and 
"how do you feel when you see your 
name in print" (Winch, 1958). 

Second, a case-history interview 
was conducted. This (Winch, 1958), 
began with the subject's earliest memories, 
covered his percepts and experiences with key 
familial and other figures, and brought him 
through his various developmental stages to 
the present moment (p. 110). 


Thirdly, eight TAT cards were 
administered. (It should be noted 
that the Cattell 16 PF, Form A, was 
also included, but of this we hear 
only a 1958 footnote characterizing 
the results as "largely negative" 
(Winch, 1958, p. 110).) 


* Several of the assessments to be made here 
have adumbrations scattered through the 
literature and acknowledgements will be 
made below. 


Each subject was given a separate 
rating on each variable for the need 
interview (NI-1); for the case history 
(CH), and for the TAT (TAT-O). 
The quantifying techniques for each 
should be noted. For NI-1, two 
judges rated each subvariable on a 
1-5 scale. Interjudge reliability is 
reported as .60. Ratings were sum- 
med and normalized (Winch et al., 
1954). 

For the TAT, the same pro- 
cedure—content analysis—was fol- 
lowed. Interrater reliabilities were 
reported as in the range of .20 (Winch 
& More, 1956a, 1956b). “This pro- 
cedure was undertaken with essen- 
tially negative results” (Winch, 1958, 
p. 110). Following this (Winch, 1958) 
we undertook a mode of analysis on the need- 
interview, the case-history interview, and 
the TAT which might be called “global” or 
"molar" or "clinical" or “projective” or 
“holistic.” A different analyst worked on 
each of these three sources of information 
and sought, as far as the data would allow, to 
create a complete dynamic analysis of each 
subject...after writing such a report and 
on the basis of the analysis he had prepared, 
each analyst would then rate the subject on 
the 44 sub-variables (pp. 110-111); 


thus, NI-2; CH; and TAT-C. 
Still another set of ratings was to 
come—the FC (full-case conference). 


In order to arrive at a psychodynamic inter- 
pretation and a set of ratings for each subject 
in which we could place our greatest con- 
fidence, we formed a clinical conference of 
five persons . . . each analyst read and crit- 
icized all three written reports... incon- 
sistencies were discussed, and relevant evi- 
dence was examined ... after arriving at 
what might be called “diagnostic” consensus, 
all 5 analysts agreed on a final set of ratings 
of the subjects' needs (Winch, 1958, p. 111). 


Thus, six sets of ratings are avail- 
able. Five are subsequently re- 
ported.* 

* TAT-O disappears. 

The FC was used as the criterion 
for validity of the other indices. The 
general range of correlations between 
FC ratings and other sets are as fol- 
lows: NI-1, .60; NI-2, .80; CH, .74; 
and TAT-C, .00 (Winch & More, 
1956). 

The hypotheses to be tested with 
these data were derived from the 
theory of CN. The statistical tech- 
nique was the interspousal product- 
moment correlation, i.e., husbands' 
subvariable scores times their re- 
spective wives' subvariable scores. Of 
1,936 possible interspousal correla- 
tions, 388 were hypothesized as to 
direction of sign: 344, involving 
different needs or traits, would be 
positive in sign; 44 involving the 
same need or trait, would be negative 
(Winch et al., 1954). The specific 
relationships hypothesized have not 
been published. The general validity 
of the CN theory was staked on a 
chi square test for greater-than- 
chance occurrence of signs of correla- 
tions in the hypothesized directions. 
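
The arithmetic behind this test can be laid out in a short sketch. The Python below checks the counts given in the text (44 subvariables per spouse, 388 hypothesized correlations) and computes a one-degree-of-freedom chi square for a count of signs falling in the predicted direction, against an even split. The count of 220 is purely hypothetical, since the figures are not given here, and the function name is an assumption for illustration. The test as written treats the 388 signs as independent events, an assumption the review goes on to question.

```python
N_SUBVARIABLES = 44
N_POSSIBLE = N_SUBVARIABLES ** 2   # every husband subvariable paired with every wife subvariable
N_HYPOTHESIZED = 344 + 44          # different-need pairs plus same-need pairs

CRITICAL_05_DF1 = 3.841            # chi-square critical value, df = 1, alpha = .05


def sign_chi_square(n_as_predicted, n_total):
    """Chi-square goodness of fit of sign counts against a 50/50 split (df = 1)."""
    expected = n_total / 2
    n_against = n_total - n_as_predicted
    return ((n_as_predicted - expected) ** 2
            + (n_against - expected) ** 2) / expected


# Hypothetical count, for illustration only: 220 of 388 signs as predicted.
hypothetical_hits = 220
chi2 = sign_chi_square(hypothetical_hits, N_HYPOTHESIZED)
exceeds_critical = chi2 > CRITICAL_05_DF1
```

If the correlations are not independent, the effective number of events is smaller than 388 and the nominal significance level of such a test is overstated.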

The results of interspousal correla- 
tional distributions NI-1, NI-2, and 
FC met this chi square test. CH did 
not. For TAT-C, the directionality 
of the distribution was reversed. 
Winch (1955a) concludes that “the 
bulk of the evidence, therefore, sup- 
ports the hypothesis that mates tend 
to select each other on the basis of 
complementary needs” (p. 554). 

A further analysis followed. Con- 
structing a Q-type matrix, the cor- 
relations for variables could be com- 
pared for married pairs versus non- 
married pairs. In this matrix of 625 
male-female correlations, 25 were of 
the former, 600 of the latter group. 
Testing for same-variable correla- 
tions, CN theory would predict 
lower (and presumably negative) cor- 
relations for marrieds, and higher for 
nonmarried. In addition to compar- 
ing the 25 to the 600, Winch also 
randomly matched each man with a 
woman not his wife and compared 
these correlations with the 25 hus- 
band-wife coefficients. In both cases, 
the NI-1 data demonstrate statisti- 
cally significant difference between 
mates' and nonmates' mean correla- 
tions and in the hypothesized direc- 
tions. The FC data do not show such 
differences. The NI-1 results are as 
follows: mean husband-wife correla- 
tion, .1016; mean man-woman cor- 
relation, .2316. The range of the 
husband-wife coefficients is from +.52 
to —.32. Nine of these were negative, 
16 positive. 
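
The structure of this comparison can be sketched as follows. The Python below builds the 25 by 25 matrix of man-woman profile correlations from synthetic random scores and separates the 25 husband-wife entries (the diagonal) from the 600 nonmate entries. The data, the seed, and the function names are assumptions made for illustration; nothing here reproduces Winch's ratings or results.

```python
import math
import random


def pearson(x, y):
    """Product-moment correlation between two equal-length score profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mate_vs_nonmate(men, women):
    """Split all man-woman correlations into mates (same index) and nonmates."""
    mates, nonmates = [], []
    for i, man in enumerate(men):
        for j, woman in enumerate(women):
            (mates if i == j else nonmates).append(pearson(man, woman))
    return mates, nonmates


# Synthetic profiles: 25 couples, 44 subvariable scores each (illustrative only).
random.seed(0)
men = [[random.gauss(0, 1) for _ in range(44)] for _ in range(25)]
women = [[random.gauss(0, 1) for _ in range(44)] for _ in range(25)]

mates, nonmates = mate_vs_nonmate(men, women)
mean_mates = sum(mates) / len(mates)
mean_nonmates = sum(nonmates) / len(nonmates)
```

On the complementarity hypothesis, the mean of the mate correlations should fall below the mean of the nonmate correlations, which is the direction of the NI-1 figures quoted above.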

Cluster analyses (Winch et al., 
1955), R-type factor analysis (Roos, 
1957), and Q-type factor analysis 
(Ktsanes, 1955) have been per- 
formed on the Winch data. All re- 
sults have been summarized in Mate 
Selection: A Study of Complementary 
Needs (Winch, 1958). This volume 
also contains speculative elaboration 
of CN theory, detailed case reports, 
etc. 

Now certain exceptions must be 
taken when it is maintained that 
the case for complementary needs 
theory has been demonstrated: 

1. Sample—Of what population 
can 25 married undergraduate couples 
be taken as representative? 

2. Ratings—"(Of the correlations) 
an indeterminate number could ac- 
tually have been spurious reflections 
of the raters' implicit theories of 
trait organization" (Katz, Glucks- 
berg, & Krauss, 1960, p. 205). This 
appraisal by Katz et al. was earlier 
voiced by Strodtbeck (1959). Bow- 
man (1955) and Kernodle (1959) 
have complained that sociologists, by 
the nature of their training, are not 
qualified to undertake psychological 
analyses such as an investigation of 
complementary needs requires. Per- 
haps; but researches are to be judged 
by their fruit rather than their roots. 
Yet this psychologist cannot but 
wish that more account had been 
taken of the problems of rater sub- 
jectivity—bias, projection, halo; as 
well as the issues of reliability and 
validity: in short, all the concerns of 
those who deal in objective psycho- 
logical assessment. 

3. Statistics—Aside from the prob- 
able nonindependent nature of the 
variables, built in by the rating 
technique, there is the question of 
statistical nonindependence. In a 
distribution of intercorrelations, when 
Variables A and B are positively re- 
lated, and likewise B and C, the rela- 
tion between A and C cannot be 
taken as an independent event. In a 
matrix of 1,936 correlations which 
are positively related throughout 
(Winch & More, 1956a, 1956b), the 
388 "tests" were not selected on a 
basis of posited independence of 
event, but without regard to this 
issue. Winch (1958) has recognized 
this problem, but commented that, 
"Just how many independent events 
there are is a very complex question” 
(p. 115). We agree. 

4. Results—The data, taken as 
they have been rated, analyzed, and 
reported do not support the CN 
hypothesis. Winch concludes that he 
is upheld by the bulk of the evi- 
dence—NI-1, NI-2, and FC; and not 
supported by only CH and TAT-C. 
Are not, however, NI-1 and NI-2 in 
reality two ratings on only one da- 
tum? And are they not correlated 
an average of .60 and .80 with the 
third supporting set, FC? Rather 
than winning by 3 to 2, comple- 
mentarity appears to have lost by 2 
to 1. (And if one is to consider 
TAT-O and the 16 PF results, the 
score becomes even more embarrass- 
ing.) 
5. Research Philosophy—Almost 
any set of data, if sufficiently 
badgered, can be exhausted into sub- 
mission. 

6. Other Research—Bowerman and 
Day (1956), using 60 couples who 
were either formally engaged or regu- 
lar dating partners and who were 
drawn as volunteers from college 
sociology classes, attempted to test 
the CN hypothesis. Their instrument 
was the Edwards Personal Preference 
Schedule (EPPS); this offered an ob- 
jective measurement of 10 of the 
needs used by Winch, as both Ed- 
wards and Winch drew from Murray's 
need list. On same-need matching, 
more evidence for homogamy than 
for complementarity was found; on 
different-need matchings, no evidence 
for either principle of organization 
was unearthed. 

Winch (1957) insisted that this 
constitutes no replication, on the 
following grounds: (a) The EPPS, 
"though ingeniously conceived—has 
no known validity for measuring 
needs." Two other objections smack 
less of the pot and the kettle: (b) 
the Bowerman and Day subjects were 
not yet married; and (c) the vari- 
ables used were not identical (p. 336). 

These objections have been an- 
swered by Schellenberg and Bee 
(1960). One hundred college couples 
were investigated. Sixty-four were 
recently married, 18 engaged, and 18 
were going steady. The EPPS was 
again the measuring device. Con- 
sidering the marrieds and unmarrieds 
separately, and the 100 couples 
severally, all evidence was for homog- 
amy, not complementarity. This 
direction of association was statis- 
tically significant for marrieds and 
for the total group. 

But were they indeed measuring 
the same things as was Winch? Seven 
of the variables in the two studies 


were also derived for the Schellenberg 
and Bee variables from the EPPS 
manual. The rank-order correlations 
between them were in the range .70- 
.78. Half the remaining variance 
was attributable to the single need 
Nurturance, which was much more 
closely related to Succorance in the 
EPPS than in Winch. 

More recently, Katz, Glucksberg, 
and Krauss (1960), using 56 couples 
with a mean marriage length of 5 
years, incorporated EPPS data into 
Winch's husband-wife versus random 
pairs design. The results were over- 
whelmingly opposed to complemen- 
tarity. 


It is our judgment, in view of the 
foregoing discussion, that the comple- 
mentary-need hypothesis as now 
stated is not tenable. 
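
The nonindependence raised under point 3 above can be made concrete with a standard result: once the correlations of A with B and of B with C are fixed, the correlation of A with C is confined to an interval, because any valid three-variable correlation matrix must be positive semidefinite. The sketch below computes that interval; the function name and the example values are illustrative assumptions and are not drawn from Winch's data.

```python
import math


def r_ac_bounds(r_ab, r_bc):
    """Admissible range of r(A, C) given r(A, B) and r(B, C).

    Follows from requiring the 3 x 3 correlation matrix to be
    positive semidefinite:
        r_ab * r_bc - s <= r_ac <= r_ab * r_bc + s,
    where s = sqrt((1 - r_ab**2) * (1 - r_bc**2)).
    """
    slack = math.sqrt((1 - r_ab ** 2) * (1 - r_bc ** 2))
    center = r_ab * r_bc
    return center - slack, center + slack


# Two strongly positive relations leave little freedom for the third.
low, high = r_ac_bounds(0.9, 0.9)

# Two zero relations leave the third entirely unconstrained.
free_low, free_high = r_ac_bounds(0.0, 0.0)
```

When the first two correlations are both .90, the third cannot fall below about .62, so its sign is no independent event; only when both are zero is the third free to range from -1 to +1.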

Due to cluster and factor analyses 
of his data (Ktsanes, 1955; Roos, 
1957; Winch, Ktsanes, & Ktsanes, 
1955), Winch believes at least two 
basic dimensions operate in marital 
patterning. He has labeled these 
dominant-submissive and nurturant- 
receptive. (One cannot fail to note the 
correspondence between the polar- 
ities and marital sex roles as ordi- 
narily conceived.) Applying these 
dimensions to his case histories, he 
discovers the following generaliza- 
tions which he then submits as 
hypotheses for verification. For ex- 
ample, irrespective of gender, indi- 
viduals who are high in Nurturance 
tend to mate with those who are 
highly Receptive and relatively non- 
Nurturant; individuals who are high 
in Dominance mate with those who 
are high in Submissiveness and rela- 
tively non-Dominant. Schellenberg 
and Bee (1960) tested these hypothe- 
ses with what appear to be the rele- 
vant EPPS variables; the hypoth- 
eses were not confirmed. Yet Winch's 
case reports (Winch, 1958) are cer- 
tainly convincing; and further, in 
analyzing his 25 couples, he reports 
the following distribution: marriages 
in which the husband is dominant, 
13; in which the wife is dominant, 9; 
mixed dominance, 3. Not only Schel- 
lenberg and Bee, but also Lu (1952c) 
report a far greater proportion of 
equalitarian matings than Winch's 
couples exhibit. 

A further consideration strikes the 
reader of Winch's case histories. 
“One is impressed with the degree to 
which it is the recollections the sub- 
ject has of his parents (as he knew 
them between, say, 6 and 18) which 
either directly, or as a counterprocess, 
shapes his needs" (Strodtbeck, 1959). 
And most impressive is the extent to 
which it is both, or the cross-sex rather 
than the like-sex parent who is 
emulated. 

The suspicion grows that Winch's 
subjects are simply not typical of 
mate-selecting individuals. That 
they should be exceptional seems 
entirely reasonable, when one con- 
siders that they were drawn from a 
postwar, early marrying, GI Bill of 
Rights supported, campus group. 
Certainly one can take some excep- 
tion to any researcher's subjects, 
and this review cannot stand on 
psychosocial speculations concerning 
these individuals. However, any 
reader of Winch's case histories must 
be impressed with how far these indi- 
viduals veer from the generalizations 
proposed in this review as predictive 
of marital success. One would there- 
fore predict an unusual degree of 
disharmony and unhappiness in these 
marriages. Perhaps follow-up data 
will some day be available by which 
the accuracy of these remarks may 
be judged. 

Now if it be granted that the com- 
plementary-needs approach has not 
met with undue success, though mak- 
ing valuable contributions as to level 
of approach and research orientation, 
where has it gone awry? The answer 
to this question lies in developments 
observable in the entire enterprise of 
behavior analysis. The marriage 
relationship can be considered as a 
stimulus situation comprised of ex- 
pectations specific to marriage. These 
marriage roles can thus be expected 
to order (or even assign) the opera- 
tive needs of the individuals con- 
cerned. Assessment of needs not 
specific to marriage is clearly not the 
logical entree to predictive study.⁴ 


ROLE THEORY 


The role-analysis approach to mar- 
riage research has had its advocates 
for many years. 


What the Freudians fail to recognize, and 
Mead left undeveloped, is the notion of mul- 
tiple patterns of role-taking in response to the 
varied demands of the groups in which the 
individual aspires to membership (Mowrer & 
Mowrer, 1951, p. 30). 


⁴ In a stimulating article, Rosow charac- 
terizes Winch's dichotomizations as “the 
operational assumption that people do not 
have general personality needs, but segregate 
these according to different social roles and 
gratify them on a role-specific basis; that is, 
some needs in one role and others in another” 
(Rosow, 1957). However, there is nothing in 
published accounts of the interviews which 
demonstrates that they were adequate for (or 
even conceived so as to provide) marriage- 
role-specific assessment of needs; further, 
neither hypotheses nor results are reported 
which give any evidence of a within-without 
patterning or effect. We must agree, how- 
ever, to the extent that Winch's work con- 
tains the ungerminated seed of the theoretical 
tree which we hope shall fructify in the fol- 
lowing section. 

Kargman (1957) has argued for the 
efficacy of role analyses, as opposed 
to the intrapsychic approach, in 
enabling both counsellor and client 
to appreciate marriage-relationship 
problems. Earlier, Mangus has 
offered an elaboration of role theory 
as it may be applied to marriage 
counselling. He offers sample hy- 
potheses, e.g., the integrative quality 
of a marriage is a function of role 
perception, role expectation, and 
role performance of marital partners. 
This paper, along with that of Sarbin, 
may well be read for expositions of 
general role theory (Mangus, 1957; 
Sarbin, 1954). Research in marriage 
roles was active by at least 1950. 
The most sophisticated psycho- 
social treatment of marriage relation- 
ships now available is that of Parsons 
and Bales (1955), which consequently 
deserves a brief resume here. Parsons 
demonstrates that in the processes of 
development, need dispositions, ob- 
ject relations, and identifications are 
inextricably related; so that although 
needs may certainly be considered as 
relatively enduring, as an individual 
finds himself engaged in a given social 
interaction, or assuming a given social 
role, this situation organizes (by dif- 
ferential orderings, rankings, and 
valences) the enduring need units. 
Any theory of action must deal not 
with the isolated units but with the 
role-ascribed organization of these 
units. Thus, “The role expectation 
. . . is itself also a motivational unit” 
(Parsons & Bales, 1955, p. 107). 
Parsons (Parsons & Bales, 1955) 
offers this pretty metaphor: 
. . . highly differentiated need-dispositions 
constitute a kind of "key-board." A given 
role-orientation is a "tune" played on that 
keyboard. Many different tunes will strike 
the same notes but in different combinations, 
and some will be altogether omitted from 
some tunes . . . the pattern of the tune is not 
deducible from the structure of the key- 
board (p. 171). 


The two dominant leit-motifs are 
the male and female sex roles. Fol- 
lowing an analysis of child socializa- 
tion in terms of family structure, 
Parsons (Parsons & Bales, 1955) con- 
cludes: 
If this general analysis is correct, then the
most fundamental difference between the 
sexes in personality type is that, relative to 
the total culture as a whole, the masculine 
personality tends more to the predominance 
of instrumental interests, needs and functions, 
presumably in whatever social system both
sexes are involved, while the feminine per- 
sonality tends more to the primacy of expres- 
sive interests, needs and functions. We would 
expect, by and large, that other things being 
equal, men would assume more technical, 
executive, and "judicial" roles, women more 
supportive, integrative and “tension-manag- 
ing" roles (p. 101). 

These principles he then applies to 
marriage roles. In Parsons' system, 
there are two primary axes of per- 
sonality differentiation, power and 
instrumental-expressive. In marriage, 
power equalization is the norm.5 As
to the instrumental-expressive axis, 
... the husband has the primary adaptive 
responsibilities, relative to the outside situa- 
tion, and that internally he is in the first 
instance “giver-of-care,” or pleasure, and
secondarily the giver of love, whereas the 
wife is primarily the giver of love and sec- 
ondarily the giver of care or pleasure (Parsons 
& Bales, 1955, p. 151). 

The husband-wife relationship is, 
of course, a subsystem of the family 
collectivity, which involves the per- 
formance of many roles. For ex- 
ample, the woman as mother must 
adopt instrumental primacy vis-a-vis 
her child, while the child in his role 
functions with expressive primacy. 
Obviously, the number, sex, and
temperament of children which come
to a couple must affect profoundly all
dimensions of marital patterning and
outcome. The limitations of this
essay, however, allow no more than
briefly noting this important caution
(see Farber & Blackman, 1956).

5 Research seems to substantiate this
assertion. Most marital partners see power
equality in their roles. As to the effects on
power distribution of extramarital role
variations, e.g., working wives versus housewives,
there is disagreement (see Heer, 1958
versus Blood & Hamblin, 1958). But the
relationship between power-as-need and
power-as-influence is unresolved in the literature.
This is unsurprising, since the relationship of
motive to behavior constitutes a key dilemma
in psychology. This situation also highlights
the importance of role-specified needs as a
construct, offering as it does a potential
solution to this basic theoretical issue.

Parsons' formulations are not sim- 
ple, yet the level of complexity is 
appropriate to that of the phenomena.
The theory has, however, outstripped 
research verification. It is our next 
task to review marriage-role re- 
searches, comparing their results to 
our own and to Parsons' generaliza- 
tions. 

In the first place, McGinnes (1958) 
repeated the study of Hill (1945) on 
campus values in mate selection. 
Subjects rated the importance to 
mate selection of 18 personal char- 
acteristics (emotional stability, good 
health, chastity, etc.). Remarkable 
consistency was demonstrated be- 
tween the two studies, separated in 
time by 17 years. Shifts occurred 
principally in those items most 
clearly related to “companionate”
marriages, and thus predictable from
the generally-accepted view that
marriages are shifting from “traditional”
to “companionate” structuring.
Role expectations, then, may be
held to exhibit reasonable stability 
over time. 

Shifts do, however, occur in indi- 
viduals over time. Different patterns 
of traits—both those desired in the 
partner, and those believed to be 
important by the partner—are evi- 
dent when subjects have reference to 
marriage partners than when refer- 
ence is to dating partners (Hewitt, 
1958). Marriage role expectations 
are held to differ according to court- 
ship stage (Hobart, 1958).* 


* Langhorne and Secord (1955) do not find 
ideal-mate conceptions differing by age or 
marital status. But their variables are need 
units (see below), and do not appear com- 
parable to these studies, which deal with 
specific traits. 
Occurring within a context of basic
similarity, then, an individual's
expectation shows differences according
to the mate role in which he operates.
The courtship role is somewhat different
from the marriage role. Parsons has
predicted this difference, and sug- 
gests that it springs from need 
achievement, which operates force- 
fully through date selection and 
courtship, then much less saliently in 
marriage roles. It is impossible to 
verify this explanation with data now 
available, but the phenomenon of dif- 
ference within basic similarity stands. 
It will be recalled that many dimen- 
sions, demonstrably involved in as- 
sortative mating, are found to occur 
intensified in more satisfactory marriages.
A hypothesis for further investigation
therefore offers itself:
the greater the concordance between 
courtship and marriage role—that is, 
the less salient during courtship are 
those variables nonrelevant to mar- 
riage-roles (e.g., need achievement)— 
the greater the probability of marital 
success. 

Investigations have been made of 
the effect of role disagreement on 
marriages. Jacobsen (1952) found 
that divorced couples exhibit a greater 
disparity in their attitudes toward 
the roles of husband and wife in 
marriage than do married couples. 
But Hobart and Klausner (1959) 
found no relationship between 
marital-role disagreement and mari- 
tal satisfaction. The published ex- 
amples from the questionnaires used 
in these two studies offer no oppor- 
tunity for comparison as to equiva- 
lence. Neither study, however, used 
a random man-woman pairing con- 
trol (or its equivalent) as should 
certainly be done following Corsini 
(1956a, 1956b). Couch (1958) found 
consensus on husband and wife roles
to increase with length of marriage, 
as did accuracy in assuming the role 
of the other mate. The study, however,
was cross sectional rather than
longitudinal. Couch offers it prin- 
cipally for its methodological and 
conceptual interests, which it indeed 
possesses. 
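
The random man-woman pairing control mentioned above may be sketched as follows. This is an illustration only, not Corsini's published procedure: the disagreement score (a sum of absolute item differences), the function names, and the number of reshufflings are all assumptions introduced here. The idea is simply that disagreement between actual spouses is interpretable only against the disagreement found when the same husbands and wives are paired at random.

```python
import random
from statistics import mean


def disagreement(husband, wife):
    """Sum of absolute item-by-item differences between two response lists.

    Assumed scoring rule; the studies reviewed do not share a single formula.
    """
    return sum(abs(h - w) for h, w in zip(husband, wife))


def actual_pairing_mean(husbands, wives):
    """Mean disagreement for the real couples (husbands[i] is married to wives[i])."""
    return mean(disagreement(h, w) for h, w in zip(husbands, wives))


def random_pairing_baseline(husbands, wives, n_shuffles=1000, seed=0):
    """Mean disagreement when the same wives are re-paired with husbands at random."""
    rng = random.Random(seed)
    shuffle_means = []
    for _ in range(n_shuffles):
        shuffled = wives[:]          # copy so the original order is preserved
        rng.shuffle(shuffled)
        shuffle_means.append(
            mean(disagreement(h, w) for h, w in zip(husbands, shuffled))
        )
    return mean(shuffle_means)
```

If the actual couples disagree no less than randomly paired men and women do, a low disagreement score cannot be attributed to the marriage itself.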

The most ambitious attempt to 
test Parsons' hypotheses has been 
that of Farber (1957). The questions 
raised by this study are many and 
important; adequate consideration 
requires a somewhat detailed exam- 
ination. Parsons and Bales (1955) 
make the broad assignment of task- 
oriented roles to the husband, and 
socioemotional roles to the wife (each 
role being subordinate to the common 
value system). Farber notes that for 
the home-centered woman, and espe- 
cially for the wife with children, less 
opportunity for variation from the 
socioemotional matrix is possible 
than is the case for the more mobile 
husband. Therefore, marriage inte- 
gration is more dependent on the 
husband's conformity to the wife's 
values than vice versa. Farber uses 
three variables: 

1. Marital Integration: measured 
by the number of times husband and 
wife rate self or spouse as stubborn, 
gets angry easily, feelings easily hurt, 
nervous or irritable, moody, jealous, 
dominating or bossy, easily excited, 
easily depressed, and self-centered. 

2. Perceived Similarity between 
Self and Other: (husband and wife, 
husband and child, wife and child, 
etc.) measures for this variable are 
derived from the same ratings of the 
10 traits listed for Variable 1. 

3. Socioemotional Valuation in In- 
teraction: measured by the following 
five values, which, along with others, 
were ranked by subjects in order of 
importance: (a) “companionship,” 
the family members feeling comfort- 
able with each other and being able 
to get along together; (b) “personality
development,” continued increase in
family members’ ability to understand
and get along with people and
to accept responsibility; (c) “satisfaction”
of family members “with
amount of affection shown," and of 
the husband and wife in their sex 
life; (d) "emotional security," feeling 
that the members of the family really 
need each other emotionally and trust 
each other fully; and (e) "a home," 
having a place where the family mem- 
bers feel they belong, where they feel 
at ease, and where other people do 
not interfere in their lives. 

From the foregoing, Farber hy- 
pothesizes: 

1. The ranking of items relating to 
socioemotional aspects of interaction 
by wives tends to be higher than the 
ranking by their husbands. 

2. The degree of marital integra- 
tion varies directly with the ranks 
assigned by the husband to domestic 
values pertaining to socioemotional 
aspects of interaction. 

Then, using 99 couples, trained 
interviewers, and ending in a dazzling 
(indeed sometimes blinding) display 
of mathematical manipulation, he 
accepts all four hypotheses as confirmed
at the .05 level. But this study
requires scrutiny. As for Hypothesis
1, Parsons’ prediction of husband-wife
differentiation in marriage roles
along an instrumental-expressive axis 
is confirmed. But as for Hypothesis 
2, when one examines the instruments 
used to measure the two variables 
involved, one concludes that it is 
demonstrated only that “husbands 
who are dedicated to getting along 
well in the family tend to occur in 
families which get along well.” That
is, marital integration is indexed by 
the same concept used to index socio- 
emotional role taking. Small wonder 
that they coincide.7

But this exception taken to Farber's
design can also be taken to
the majority of marital studies in the
literature. For all the objective
measures of marital satisfaction now
current are heavily weighted with
indices of togetherness, of agreement,
of interpersonal smooth sailing. As
a typical example, on the recently
published Locke-Wallace short test
(Locke & Wallace, 1959), 11 of the
15 items seem related to social emotional
integration via agreement and
togetherness. Certainly less friction
in marriage may produce greater
durability. Buerkle, Anderson, and
Badgley (1961) have factor analyzed
responses to the Yale Marital Interaction
Battery, composed of endorsements
of alternative solutions to
marital conflict situations. Factor
scores were computed, and differences
between adjusted marital groups and
nonadjusted groups reported. The
nonadjusted marriages were at that
time being counseled for marital
difficulties. The adjusted were marriages
drawn from religion-affiliated
groups. (The authors recognize the
problem of accepting this group as
adjusted.) At any rate, adjusted
husbands were more likely to submit
to wife domination and to grant the
wife greater deference and respect.
Adjusted wives were more likely to
defer to the husband's judgment, and
to expect less deference and respect
from their husbands.

This study does not, however,
speak to the issue of marital happiness.
And, if considerations of
“socioemotional integration” are less
salient for the man in his marital
role, it would seem that his marital
happiness and success must be assessed
by indices more pertinent to
the satisfaction of his own peculiar
motives.

7 Farber also investigates other hypotheses.
Inspection of the measuring devices
again demonstrates conceptual nonseparation
of the independent and dependent variables.
Farber maintains that statistically,
the indices need not covary. However,
identical thermometers may also vary
independently; our point is that Farber has
placed two thermometers in a single solution.

We turn now to the important 
work of Langhorne and Secord (1955). 
When role expectations are analyzed 
separately for men and for women, in 
terms of motivational units, an impres- 
sive difference occurs; that is to say, 
women need different things from 
husbands than husbands need from 
wives. Langhorne and Secord have 
performed the service of describing, 
empirically, this difference. In six 
states (Virginia, Georgia, Mississippi, 
Ohio, Kansas, & Wyoming) 5,000 
college and university students were 
asked to list, on blank paper, those 
traits which were desired in a mate. 
The authors then categorized (arbi- 
trarily, but rather convincingly) the 
traits into need units adapted from 
Murray’s list. Significant differences 
in need patterns did not occur by age, 
marital status, or geographical re- 
gion. Differences by sex were signifi- 
cant both statistically and theoreti- 
cally. 


Women are more concerned than men with 
receiving affection, love, sympathy, and 
understanding from their spouse, although it 
should be recalled that (this) need is one of 
the strongest of both sexes. Secondly, males 
are more desirous of having a spouse who is 
neat and tidy about the home, and who will 
adjust to a routine, avoid friction, be even- 
tempered, home-loving, reasonable and de- 
pendable, than are females.... Another 
category with a relatively large absolute dif- 
ference is (Social stimulus value). Men are 
more concerned about the impression their 
future wife will make upon their friends and 
acquaintances than are women about the 
impression their future husband makes upon 
other persons. Women also stress (Achieve- 
ment) more than males: in the present group 
not a single male listed an achievement trait 
as desirable in his future wife, whereas 6.8 
per cent of the traits listed by women were in 
this category. Included here are such traits 
as getting ahead, ambitious, enjoys working, 
energetic, has high status profession, etc.
(Langhorne & Secord, 1955, p. 32). 
It is obvious that analysis is
needed to determine if such grouped 
traits do indeed covary. However, 
Langhorne and Secord's groupings do 
no violence to customary conceptions 
of the need units employed. And 
their results are provocative. Not 
only does an individual wish the 
spouse to conform to the appropriate 
sex role, but notice particularly that 
heavily emphasized are those attri- 
butes which implement performance 
of the respondent's own role. The 
wife, whose role responsibility is 
socioemotional, wishes a husband 
who will work with her in an atmosphere
of loving intimacy; the “instrumental”
husband wishes a wife, who
through her attractiveness and effi- 
ciency, implements his responsibility 
for instrumental success. This does 
not contradict the concomitant de- 
sires, respectively, for an achieving 
husband and for a loving wife. 

Thus, following Parsons and Farber,
we might indeed expect marital
satisfaction to increase as the husband's
socioemotional valuation increases.
But this holds for the wife, not necessarily
for the husband, whose marital
satisfaction might be more reliably
forecast by the degree to which his
mate assists the performance of the
male-instrumental expectations.

The obvious extension of this 
emerging generalization was made a 
decade ago by Ort (1950). His basic 
hypothesis is: 
the amount of self-judgment of “happiness” 
or “unhappiness” in marriage depends upon, 
or is at least related to, the number of con- 
flicts between role expectations and roles 
played by the subject, and role expectation 
for the subject's mate, and the roles played 
by the subject's mate, as the subject sees it. 


Fifty married couples were verbally 
queried on 22 issues, for example: 
Should a husband kiss his wife when 
he leaves for work? Should a hus- 
band expect to win most arguments? 


Both expectations concerning these
roles and the subject's perception of
the performances of them in the marriage
were recorded. The number of
conflicts between expectation and
performance (for self and for mate)
was totaled for each individual.
Conflict totals correlated -.83 with
subject's self-rating on a 10-point
happiness scale.
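
Ort's scoring, as described above, can be sketched in a few lines. The sketch assumes that each of the 22 issues receives a yes/no answer for expectation and for perceived performance; the function names and the answer format are illustrative assumptions, not Ort's own materials.

```python
from math import sqrt


def conflict_total(self_expected, self_performed, mate_expected, mate_performed):
    """Count issues where expectation and perceived performance differ.

    Conflicts are counted for the subject's own role and for the mate's role,
    then summed, following the description in the text.
    """
    own = sum(1 for e, p in zip(self_expected, self_performed) if e != p)
    mate = sum(1 for e, p in zip(mate_expected, mate_performed) if e != p)
    return own + mate


def pearson_r(xs, ys):
    """Product-moment correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)
```

Applied to one conflict total and one happiness self-rating per subject, `pearson_r` yields the kind of coefficient Ort reports; a strongly negative value means that more conflicts accompany lower rated happiness.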

Ort concludes that happiness lies in 
the individual playing the role he 
expects, and in having the spouse play 
the role expected of him or her, re- 
gardless of what these roles might be. 


The author interviewed certain couples who
entered marriage expecting to be
promiscuous and with those expectations
filled, their self-evaluation was number one.
Likewise the author interviewed couples who
had expected fidelity for the self and the
mate and were fulfilling these expectations
and they also gave themselves a
rating of number one (Ort, 1950, p. 697).

As noted above, Hobart and
Klausner (1959) found no relation
between marital-role disagreement
and marital satisfaction. In their
study, 70 items of a role inventory (a
mailed questionnaire) were scored
from one through five, and role disagreement
was calculated through the sum of
the 70 differences between spouses.
These authors interpret their results
as refuting Ort. Note, however, that
they did not investigate expectations
satisfied, which is central to Ort's
thinking.

Hurwitz (1959) has pursued the 
issue of expectation and satisfaction 
with his Index of Strain. Ten role 
items (e.g., “I am a companion to my 
wife") were ranked twice by each 
husband and each wife, first for the 
subject's performance, and second for 
the expectations of the spouse's be- 
havior. The Index of Strain is the 
cube root of the sum of the cubes of 
the differences between the ranks the 
subjects assign to each role. Hurwitz 
reports the following results. 
The Index of Strain is significantly
higher for husbands than for wives.
That is, wives conform more to husbands’
expectations than husbands
do to wives’. The husbands’ and
wives’ Indices of Strain correlate +.20.
The correlations of the Index of Strain
with the Locke-Burgess-Cottrell
Marital Adjustment Scale are as
follows: the husbands’ Index of
Strain correlates -.22 with their own marital
adjustment, and -.23 with the wives’
marital adjustment. Yet the wives’
Index of Strain is significantly correlated
with neither their own nor
their husbands’ adjustment!
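
The Index of Strain, as worded above, is the cube root of the sum of the cubed rank differences. A minimal sketch follows. The text does not say how the sign of a difference is treated; absolute differences are assumed here so that opposite discrepancies cannot cancel, and the function name is illustrative.

```python
def index_of_strain(performance_ranks, expectation_ranks):
    """Cube root of the summed cubed differences between two rankings.

    performance_ranks: ranks one spouse gives to his or her own role performance.
    expectation_ranks: ranks the other spouse gives as expectations for that role.
    Absolute differences are an assumption; the source states only "differences".
    """
    if len(performance_ranks) != len(expectation_ranks):
        raise ValueError("both rankings must cover the same role items")
    total = sum(abs(p - e) ** 3 for p, e in zip(performance_ranks, expectation_ranks))
    return total ** (1.0 / 3.0)
```

Because the differences are cubed before summing, one large discrepancy between performance and expectation outweighs several small ones, which seems to be the intent of the index.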

This is another demonstration that 
the husband's is the key role in mari- 
tal success. Though the relationships 
between expectation and satisfaction, 
even for husbands, are not as strong 
as in Ort's study, this must be in part 
attributable to the difference in the 
sample of role items. 

Certainly satisfaction now seems 
related to happiness, perhaps tauto- 
logically. But satisfaction of what? 
Ort suggests role expectations. Katz, 
Glucksberg, and Krauss (1960) investigated
some aspects of need satisfaction.
Subjects rated the satisfaction
provided by the mate on each of 11 
EPPS variables. These satisfactions 
were also totaled for each subject. 
For wives, the totals were positively 
related to their own scores on Nur- 
turance and Succorance; positively 
related to their husbands' scores on 
Nurturance and Achievement; and 
negatively to husbands' scores on 
Abasement and Autonomy. The 
totals for husbands were positively 
related to wives' scores on Succorance 
and Nurturance, and negatively to 
wives' Autonomy and Dominance.
If any generalization can be drawn
from these complex results, it would
seem that individuals' needs are best
satisfied within the marriage when
both husband and wife operate
with something like conventionally-expected
sex roles, modified by need
constellations allowing companionate
marriage structure.

One additional study, done early 
but infrequently cited, has attempted 
to assess need satisfaction’s effect— 
in this case, on mate selection 
(Strauss, 1947). Three-hundred 
seventy-three engaged or recently 
married subjects checked items on a 
questionnaire, if these items de- 
scribed one of their major needs. 
Later, they were asked if the mate, 
or other persons, satisfied these needs. 
Only 8.0% of the population appraised some 
other person as having satisfied their needs 
better than had the actual mate. As high a 
percentage as 89.2 appraised no other person
of opposite sex as having filled major needs
better than had the mate (Strauss, 1947, p.
333).


The distribution of needs satisfied 
appears to be highly skewed posi- 
tively. This study deserves replica- 
tion with adequate controls and an 
assessment technique which is less 
crude. But Strauss' study possesses 
the virtue of investigating needs appropriate
to within-marriage considerations;
for example, a need for
"someone who shows me a lot of 
affection"; "who helps me in making 
important decisions"; "for someone 
who loves me.” 

The crucial issue now facing mar- 
riage-role researchers seems to be the 
identification of the crucial dimen- 
sions of marriage-role expectations 
and performances. These dimensions 
must be established through the ob- 
served covariation of discrete action 
units. The author is currently en- 
gaged in such an enterprise. 


SUMMARY AND CONCLUSION 


Let us now draw together summary
generalizations from our study of
existing research and theoretical materials.

Mates are selected from a field of
eligibles. This field is determined by
homogamy as to race, ethnic origin,
social class, age, religion, and by residential
propinquity. Exploration of
this field is a function of unknown
psychological variables. Cultural
homogamies provide for a measure of
similarity between mates with respect
to social, value, and personality characteristics.
Mate-selection (courtship)
roles manifest patterns of needs
and expectations which differ in con- 
tent and organization from marriage 
roles. The greater the congruence be- 
tween the two roles, the greater the 
likelihood of marital satisfaction. 
Modal role definitions exist, and are 
sex-differentiated. They are provided 
for by parental identifications. The 
husband role is the more instru- 
mental, the wife role the more expres- 
sive-integrative. The wife being 
therefore more accommodating, the 
husband more rigid in role needs, the 
likelihood of marital success is a func- 
tion of the husband’s possession of 
the expected instrumental needs and 
capacities. Many individuals and 
marriages are not organized along 
these modal principles. The more 
general statement, therefore, is that 
marital satisfaction is a function of 
the satisfaction of needs and/or ex- 
pectations specific to husband and 
wife roles. 

The author recognizes that these 
are largely unverified hypotheses. 
They are, however, reasonably inter- 
related and made worthy of research 
effort by an existing body of data. 
This approach, however, has a serious 
limitation: it is largely restricted, by 
empirical data now available, to con- 
siderations of assortative mating and 
happiness in mating. Surely we must 
enlarge our view, in order, for example,
to investigate developmental
processes in marriage with Foote
(1956), and psychological
processes with Uhr (1957).

Other reviewers might well ab- 
stract generalizations quite different 
from those presented here. Any 
analyst's eyes are focused by his own 
convictions, and the author's own 
might be made explicit here: role 
theory provides the best available 
framework for investigation of psy- 
chological phenomena in marriage; 
and, psychologists may well apply 
their skills to these issues—issues of 
pressing practical, ameliorative, and 
basic theoretical concern. 
BACKWARD MASKING1


DAVID H. RAAB 
Brooklyn College 


In backward masking, perception of a test stimulus is suppressed by 
masking stimulation that is presented subsequently. Psychophysical 
studies of this phenomenon have utilized visual, auditory, and cutaneous 
stimuli. These masking studies are reviewed and their results discussed 
in terms of possible neural mediating mechanisms. 


As used in the study of hearing, 
the term masking denotes “the proc- 
ess by which the threshold of 
audibility for one sound is raised by 
the presence of another (masking) 
sound" (American Standards Asso- 
ciation, 1960, p. 46). The term has 
long been used to designate analogous 
effects in olfaction (see Geldard, 1953, 
Ch. 14), and Boynton and Kandel 
(1957) and Békésy (1959) have re- 
cently used it in describing visual and 
cutaneous interactions. 

Typically, the absolute threshold 
for the test stimulus is measured with 
and without the masking stimulus; 
the difference between the two thresh- 
olds is the amount of masking. G. A. 
Miller (1947) has pointed out that 
discrimination of mask from mask- 
plus-probe involves a kind of intensi- 
tive difference limen. Tanner, Swets, 
and Green (see Green, 1960) have 
analyzed such instances of detecting 
signals in noise in terms of a model 
based on statistical-decision theory. 
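
The definition of amount of masking given above reduces to a single subtraction. The short sketch below states it explicitly; the function name and the use of decibels as the threshold unit are assumptions made for illustration.

```python
def amount_of_masking(masked_threshold_db, unmasked_threshold_db):
    """Amount of masking: threshold with the mask present minus threshold without it.

    Both thresholds are assumed to be expressed in decibels on the same reference.
    """
    return masked_threshold_db - unmasked_threshold_db
```

For instance, a probe whose threshold rises from 10 db. to 35 db. when the mask is added has undergone 25 db. of masking.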

Two extensions of the masking 
paradigm are worthy of note. First, 
the interfering effects of a mask are 
not restricted to the region of the 
threshold. Partial masking, at suprathreshold
levels, can be noted. Second,
the masking and test stimuli
need not be presented simulta- 
neously. Forward and backward 
masking can occur. 

1 Supported by a grant (G 16349) from the 
National Science Foundation. Thanks are 
due William H. Ittelson for his critical reading 
of a first draft of this paper. 
In the case of forward masking, the
aftereffects of stimulation are ex- 
plored by means of a probe stimulus 
presented at various times after the 
cessation of the mask. Short-term 
reversible masking effects are com- 
monplace. Although it is not the pur- 
pose of this paper to review the very 
extensive literature dealing with 
adaptation, sensory fatigue, recovery, 
etc., we may note here that the masking
and probe stimuli in these experiments
are frequently called “conditioning”
and “test” stimuli, respectively.
The time between the two
stimulus onsets is often symbolized
by “Δt.”

In backward masking, the test
stimulus precedes the mask in time;
suppression is apparently retroactive
in character.2 The fact that a prior
stimulus can somehow be rendered
less effective by subsequent stimulation
is intriguing. It is not surprising
to find, therefore, that the literature
contains more than 70 studies of
backward masking, perceptual blanking,
erasure, etc. Since these experiments
have provided us with results,
at threshold and above,
which bear significantly on notions
about neural mechanisms in perception,
it seemed worthwhile to review
them. In all but two of the studies,
either visual or auditory stimulation
was used. More than two-thirds of
the papers appeared within the last
10 years, very probably an expression
of increasing interest in temporal
factors in vision and hearing.

2 If stimulus asynchrony is always measured
from the onset of the mask, the backward
masking experiment is characterized by
negative values of Δt.
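
The sign convention for stimulus asynchrony (onsets measured from the onset of the mask) can be made concrete with a small sketch. The function names and the millisecond unit are assumptions introduced for illustration; only the convention itself comes from the text.

```python
def stimulus_asynchrony_ms(probe_onset_ms, mask_onset_ms):
    """Asynchrony measured from mask onset: negative when the probe comes first."""
    return probe_onset_ms - mask_onset_ms


def masking_type(delta_t_ms):
    """Classify a masking arrangement by the sign of the asynchrony."""
    if delta_t_ms < 0:
        return "backward"      # probe (test stimulus) precedes the mask
    if delta_t_ms > 0:
        return "forward"       # probe follows the mask
    return "simultaneous"
```

A probe presented at 0 msec. with a mask beginning at 30 msec. thus has an asynchrony of -30 msec. and falls under backward masking.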


AUDITORY BACKWARD MASKING 


R. L. Miller (1947) masked one 
train of periodic tone bursts by 
another. Frequency of modulation 
(i.e., burst frequency) was the same 
for masking and probe tones. In 
general, masking increased as the 
tone bursts in the two trains were 
phase shifted to occur more closely in 
time. Maximum masking was found
when the probe tone bursts preceded 
the masking bursts by 1 to 2 msec. 
The amount of this time difference 
was independent of burst frequency, 
but was a function of the tonal fre- 
quencies in the two trains. 

Almost 10 years passed before 
backward masking of one sound by 
another was again explored. Samoi- 
lova's (1956) experiments were begun 
in an effort to understand self-mask- 
ing effects encountered in the percep- 
tion of high intensity speech; the 
durations chosen for the probe (20 
msec.) and masking (300 msec.) tones 
were considered to approximate the 
durations of consonants and vowels. 
Samoilova's results showed backward 
masking to increase monotonically 
with increasing intensity of mask, 
with decreasing duration of probe, 
and with decreasing interval between 
stimulus onsets. Maximum masking 
by 500-cycle and 1,000-cycle tones 
occurred with probe frequencies of 
approximately 550 and 1,300 cycles, 
respectively. Some masking (about
5 db.) of a 1,200-cycle probe by a
1,000-cycle mask (SL=90 db.) was
found with Δt as large as -1,020
msec. (Samoilova, 1959b).

In a further study (Samoilova,
1959a) of backward masking as obtained
with different stimulus frequencies,
Δt was maintained at -30
msec. Of interest is the finding of as
much or more masking of a 1,000-cycle
probe by 4,000 cycles as of a
3,000-cycle probe by 1,000 cycles. Although
the highly trained subjects in
this study showed less backward
masking than less practiced listeners, 
differences within and among indi- 
viduals were still considerable. 

Pickett (1959) and Elliott (1962b) 
used 50-msec. bursts of white noise 
to mask brief, 1,000-cycle tone bursts. 
With masking stimuli of high intensity
(90 db. SPL or more), backward
masking of 5-msec. tone bursts was
found to extend beyond Δt = -55
msec. Less masking, especially at
small values of Δt, was found
when the masking noise and probe
tones were led to separate ears
(Elliott, 1962b).

In another experiment, Elliott 
(1962a) used probe tones of 500, 
1,000, and 4,000 cycles (burst duration
= 7 msec.). For both monotic
and dichotic listening, and with Δt
greater, negatively, than -12 msec.,
the 500-cycle tone was masked most
and the 4,000-cycle tone masked
least. With smaller values of asynchrony,
two other orderings of probe
frequencies emerged, one for monotic,
the other for dichotic stimulation.

Temporal masking of clicks by 
clicks was studied by Chistovich 
and Ivanova (1959) and by Raab 
(1961). With the masking click at
approximately 65 db. SL, backward
masking was found in the region between
Δt = 0 and Δt = -10 msec.
More masking, extending beyond
Δt = -10 msec., was produced by a
more intense masking click. In both
studies, a rise in test click threshold
beyond Δt = -50 msec. was encountered.
This effect was related to
changes (with changing Δt) in the
listener’s criterion for detection
(Chistovich & Ivanova, 1959), and to
increased difficulty in fixing precisely 
the time of occurrence of the probe 
when it is not followed closely by the 
mask (Raab, 1961). 

Short-term backward masking be- 
tween suprathreshold clicks was de- 
scribed by Guttman, van Bergeijk, 
and David (1960), who presented two 
clicks to one ear and a single click 
to the other ear. Binaural fusion of
the single click with one of the two
pulses delivered to the test ear was
then investigated. With the monaural
clicks separated by 2 msec.
(Δt), and with the second more intense,
it alone controlled binaural
fusion with the (contralateral) probe. 
Although the first (weak) click was 
not always inaudible monaurally, its 
ability to be centered binaurally was 
retroactively masked. 

Gol'dburt (1961) found backward 
masking of suprathreshold tone bursts 
to extend beyond Δt = -1,000 msec.
The masking stimulus was a 400- 
msec., 100-cycle burst at 100 db. 
SPL. Test stimuli (100-cycle bursts) 
were varied in duration between 5 
and 400 msec. Presented before the 
masking sound, these audible test 
stimuli were perceived to be shortened
in duration as compared with un- 
masked tone bursts. Test stimuli 
(duration <50 msec.) at sensation
levels in excess of 90 db. were 
phenomenally shortened by the 
masking tone. For masked and un- 
masked bursts (at 80 db. SL) to ap- 
pear equal in duration, a seven-fold 
increase in duration was required on 
the former. 


CUTANEOUS BACKWARD MASKING


Two studies of temporal masking 
were found which did not use visual 
or auditory stimulation. Schmid 
(1961) measured thresholds for elec- 
trical pulses delivered to the third or 
fourth digit of one hand. Conditioning
shocks were applied to the index
finger. Maximum masking occurred 
at small, positive values of Δt (1-5
msec.). Backward masking was
found up to Δt=−10 msec. for one
subject and up to Δt=−30 msec. (the
largest temporal separation used) for 
the other subject. 

Halliday and Mingay (1961) de- 
livered conditioning shocks to one 
forearm and test shocks to the other. 
Test stimulus thresholds were found 
to be slightly elevated with Δt as
large as −100 msec.


VISUAL BACKWARD MASKING 


Three procedures have been used 
to produce retroactive masking in 
vision. In the main, these differ from 
each other with respect to the spatial 
size and location of the masking 
stimulus. In the case of the Broca- 
Sulzer effect, test luminance and 
masking luminance are delivered to 
the very same target area. Indeed, 
masking is produced with a single 
flash simply by lengthening its dura- 
tion. In metacontrast suppression, 
the masking stimulus occupies areas 
adjacent to (but not including) the 
test patch. In the case of the Craw- 
ford effect, the test area is wholly in- 
cluded within a larger masking stim- 
ulus area. The results obtained with 
each of the three procedures will be 
considered in turn. 


Broca-Sulzer Effect 


In 1902, Broca and Sulzer reported 
that the brightness of a white target 
increased with exposure duration up 
to about 50 msec. and then dimin- 
ished as the stimulus was prolonged to 
250 msec. In effect, the later mo- 
ments of stimulation in the longer 
flash acted to attenuate the bright- 
ness evoked by the luminous energy 
delivered earlier. 

Bills (1920) and Kleitman and 
Piéron (1924) reviewed the early
literature dealing with the growth of
brightness as a function of stimulus 
duration. 

Stainton (1928), using the method 
of Broca and Sulzer, matched various 
short-duration test flashes to a sur- 
rounding light of much longer dura- 
tion by adjusting the luminance of 
the latter. The results confirmed the 
findings of Broca and Sulzer that the 
“overshooting” in the growth of 
brightness with flash duration is more 
pronounced and occurs with shorter 
durations as luminance level is
increased.

Baumgardt and Segal (1942-43) 
compared simple reaction times (RTs) 
to 50-msec. and 250-msec. flashes and 
found that the briefer (and brighter) 
stimuli evoked shorter RTs. Raab, 
Fehrer, and Hershenson (1961) noted, 
however, that since RT to the longer 
flash was only about 150 msec., the 
last 100 msec. (or more) of stimula- 
tion could not have affected the RT. 
With flash duration randomized over 
trials, RT was then found to be un- 
affected by changes in duration be- 
tween 10 and 500 msec. at luminance 
levels at which the Broca-Sulzer 
effect is found. Minimum RT at a 
given luminance was subsequently 
found to be given by much shorter 
durations than are required for the 
full growth of brightness (Raab & 
Fehrer, 1962). 

Magnitude estimations (see Ste- 
vens, 1956) of the brightness of 
single stimuli varying in both lumi- 
nance and duration show the Broca- 
Sulzer effect (Raab, 1962). Although 
this procedure for assessing bright- 
ness affords much less precision than 
matching (null) methods can provide, 
it does avoid the contaminations— 
inherent in the latter—arising from 
asynchronous stimulation of adjacent 
retinal areas (see Alpern, 1953). 

Katz (1959) changed the Broca- 
Sulzer procedure by providing his observers
with a standard flash
(duration=200 msec.) at several fixed
luminance levels and adjacent, short- 
duration test flashes. Standard and 
test stimuli were terminated together 
on each trial. Monocular and haplo- 
scopic brightness matches were made 
by adjusting the luminance of the 
test flashes. By providing his sub- 
jects with a constant reference stand- 
ard, it was possible to show that the 
Broca-Sulzer overshooting is not a 
true enhancement effect; to obtain 
equally bright 8-msec. and 50-msec.
flashes required less energy in the 
briefer stimulus (see Boynton, 1961, 
p. 741).

Raab and Osman (1962) showed 
Katz’s method to be sensitive to 
variations in the temporal overlap of 
the standard and test flashes. With 
the stimuli terminating together, 
Katz's results were confirmed. Dif- 
ferent functions relating the growth 
of brightness to duration were gen- 
erated, however, when the flashes 
matched to each other had coincident 
mid-durations or coincident onsets. 
In the last instance, for example, the 
Broca-Sulzer effect was reversed, i.e., 
50-msec. flashes appeared fainter 
than 200-msec. flashes of equal 
luminance. Baumgardt and Segal 
(1942-43) had earlier compared adja- 
cent targets of unequal duration (50 
and 100 msec.) when the flashes be- 
gan together and when they ended 
simultaneously. 


Metacontrast 


“The brightness of a flash of light 
is considerably reduced when it is 
followed by a second flash in an 
adjacent region of the visual field” 
(Alpern, 1953, p. 648). This phe- 
nomenon, termed Metakontrast by 
Stigler (1910), was first explored sys- 
tematically by Stigler and by Baroncz 
(1911). Alpern (1952) has summarized
Stigler’s and Baroncz’s experiments
and has reviewed prior
studies of related phenomena. 

An instance of retroactive suppres- 
sion at suprathreshold levels, meta- 
contrast was rediscovered by Fry 
(1934), who arranged to mask the 
center one of three rectangular tar- 
gets by flashing it before lighting the 
two flanking areas. Similar stimulus 
arrays were used by Piéron (1935b) 
and by Alpern (1953). Alpern es- 
sayed to measure the metacontrast 
effect by means of binocular bright- 
ness matching between the masked 
stimulus and a comparison-standard. 

The results of Alpern’s detailed, 
parametric study are as follows: 
Metacontrast masking is a U-shaped
function of Δt, maximum suppression
occurring in the region of Δt=−100
msec. Metacontrast increases with 
increasing luminance or duration of 
the masking stimuli or with decreas- 
ing duration of the test flash. Mask- 
ing of the center target decreases 
rapidly as the flanking stimuli are 
spatially removed from it. The rela- 
tively short distance (approximately 
1°) over which metacontrast effects 
can be induced is not increased by 
increasing the luminance of the induc- 
ing flashes (Alpern, 1954). Finally, 
although Stigler (see Alpern, 1952) 
and Baumgardt and Segal (1942-43) 
had previously demonstrated inter- 
ocular induction of metacontrast, 
Alpern was himself unable to repro- 
duce this effect. 

Fehrer and Raab (1962) showed 
that the decrease in brightness of a 
target masked by metacontrast was 
not accompanied by an increase in 
RT to the target. Normal RTs (ap- 
proximately 160 msec.) could be 
evoked by flashes which were phe- 
nomenally absent. Indeed, under 
appropriate circumstances, subjects 
can respond with normal latency to a
stimulus whose probability of detection
is near chance (Fehrer &
Biederman, 1962). 

Piéron (1935b) flashed four adja- 
cent targets successively and showed 
that retroactive masking of one 
stimulus diminished its brightness, 
but did not interfere with its ability 
to mask a prior stimulus. Piéron 
(1935a, 1935b) also called attention 
to the significance of metacontrast in 
the perception of moving targets; in 
particular, Fröhlich's (1929) illusion
was interpreted in terms of "erasure" 
rather than of Empfindungszeit. Al- 
pern (1953) has, however, noted the 
flaws in Piéron’s analysis. 

With asynchronous presentation of 
adjacent black figures (on lighter 
backgrounds), there is also masking 
of the prior stimulus (Kolers, 1962; 
Kolers & Rosner, 1960; Werner, 
1935). In these disc-ring experiments, 
in which the disc (target) and ring 
(mask) have equal luminance and 
equal contrast, detection of the disc 
is a U-shaped function of Δt (Kolers,
1962; Kolers & Rosner, 1960), and
interocular ("dichoptic") suppression
is readily obtained (Kolers & Rosner,
1960; Werner, 1940). If the target
stimulus is severely reduced in contrast,
however, the detection-Δt function
becomes monotonic; the disc becomes
increasingly difficult to detect
as Δt approaches zero (Kolers, 1962).

With flashes of light delivered to 
the dark-adapted eye, the shape of 
the masking-Δt function can also be
changed by changing the test-to-mask
contrast (luminance) ratio. When
target and mask have approximately
equal luminance, brightness of the
target is a U-shaped function of Δt
(Fehrer & Smith, 1962). If the
luminance of the target is markedly
reduced from that of the masks (as
is required for the measurement of
thresholds), simultaneous contrast
becomes operative and masking decreases
monotonically with increasing
Δt (Fehrer & Smith, 1962; Kietzman,
1962; Kolers, 1962).

With test stimuli that are pat- 
terned, masking may still occur, al- 
though less readily (Toch, 1956; 
Werner, 1935). If the test stimulus 
consists of several circles (Werner, 
1935) or of several letters (Averbach 
& Coriell, 1961), metacontrast can be 
used to erase selectively one or an- 
other of these elements. 


Crawford Effect 


In 1947, Crawford reported the re- 
sults of a study of short-term foveal 
light-adaptation, in which he presented
a .5°, 10-msec., circular test
flash before, during, and after illumination
of a concentric, 12° conditioning
stimulus (duration=524 msec.).
Of interest here is Crawford's finding
of elevated test flash thresholds when
the conditioning stimulus followed
the probe by as much as 100 msec.
(Δt=−100 msec.). Backward masking
of a small probe stimulus by a
larger one as a function of relative
luminance had earlier been studied by
Piéron (1925) and by Monjé (1928)
in efforts to measure "sensation-time."

Actually, Monjé’s study was an 
attempt to compare two different 
procedures for measuring Empfind- 
ungszeit: the method of Fröhlich 
(1929) and of Hazelhoff and Wiersma 
(1924), which is based on lags in the 
perception of moving targets; and the 
two-flash masking method of Exner 
(1868), as modified by Kunkel (1874). 

The last 100 years have seen many 
studies of "the lag of visual sensa- 
tion." As long ago as 1885, however, 
Cattell noted that investigators often 
confused the minimum duration of a 
stimulus required for maximal sensa- 
tion with the minimum interval be- 
tween two stimuli at which the first
(weaker) is not masked by the second
(more intense). Obviously, the first 
determination is related to the growth 
of brightness with stimulus duration 
(see the discussion of the Broca- 
Sulzer effect), while the second con- 
cerns the latency of sensation as 
affected, for example, by stimulus in- 
tensity. 

The monotonic rise in test flash 
threshold as asynchrony is decreased 
from Δt=−100 msec. to Δt=0 has
been studied as a function of such 
conditioning flash parameters as lu- 
minance (Battersby & Wagman, 
1959; Crawford, 1947; Kandel, 1958), 
duration (Battersby & Wagman, 
1959), and size (Battersby & Wag- 
man, 1962; Kandel, 1958). The 
Crawford effect is also seen when the 
test and conditioning stimuli are 
colored flashes (Bush, 1955) or are 
black figures on white backgrounds 
(Kolers, 1962). With scotopic vision, 
the masking effect is greater (Kandel, 
1958) and extends to Δt=−200 msec.
(Boynton & Triedman, 1953). 

Sperling (1960b) has shown that, 
in the region near Δt=0, if the luminance
of the masked probe is increased,
the flash becomes detectable,
but what is seen is a negative image 
of the stimulus. 

Boynton, Bush, and Enoch (1954) 
found that similar backward masking 
functions were generated when veil- 
ing flashes and glare flashes were used 
as conditioning stimuli. 

Baumgardt and Segal (1942-43) 
used test and conditioning flashes of 
equal duration (10 msec.) and equal 
luminance, and showed backward 
masking of suprathreshold bright- 
ness. 

Boynton considered the Crawford 
masking effect to be caused by “the 
burst of on-activity associated with 
the onset of the conditioning stimulus"
(Boynton, 1961, p. 747). In support
of this view is the finding that a
preadapting light reduces the amount 
of masking produced by a condition- 
ing flash, presumably by attenuating 
the physiological on-response evoked 
by the conditioning stimulus (Boyn- 
ton, 1958; Boynton & Kandel, 1957). 
If, in addition, the brightness of a 
brief (conditioning) flash is given by 
the magnitude of the on-response 
(see Adrian, 1928; Katz, 1959), then 
conditioning stimuli of equal bright- 
ness should have equal backward 
masking effects. This prediction was 
confirmed: When flashes of unequal 
luminance were made to appear 
equally bright by trading luminance 
for duration, these stimuli then pro- 
duced identical backward masking 
functions (Boynton & Siegfried, 
1962). In a subsequent experiment, 
flashes of unequal luminance—made 
equally bright by differential pre- 
adaptation—were again shown to 
have equal masking effects (Onley & 
Boynton, 1962). 

When the conditioning stimulus is 
presented to one eye and the test 
flash to a homologous position in the 
other eye, backward masking of the 
latter is still found (Battersby & 
Wagman, 1962; Kandel, 1958; Wag- 
man & Battersby, 1959). There is 
much less masking with such hap- 
loscopic stimulation—the greatest 
threshold shift is only about 5 db.— 
and the amount of suppression is 
more or less independent of condi- 
tioning flash luminance. The finding 
of some central masking is given fur- 
ther support by the differential re- 
sults of monocular and interocular 
tests of patients with chiasmal and 
with retrogeniculate lesions (Bat- 
tersby, Wagman, Karp, & Bender, 
1960). 

These results stand in contrast to 
the finding of no interocular masking 
with complete, steady-state
light-adaptation of one eye (Crawford,
1940; Kandel, 1958). The threshold 
for a monocular flash is affected by 
contralateral excitability, but only 
when the latter is changing (Boyn- 
ton, 1961, p. 748). 

If the test flash is patterned or con- 
toured, the informational require- 
ments of the masking experiment are 
somewhat altered; a response of 
recognition rather than of detection 
is required. Kietzman (1962) has 
called attention to this difference between
studies of "perceptual blanking"
and studies of the Crawford effect,
and has shown, in addition,
that, with comparable procedures, 
there is less masking of test flashes 
that are “informational.” 

Baxt (1871) was the first to study 
retroactive masking of patterned 
stimuli. In subsequent studies,
usually employing tachistoscopes to 
vary stimulus durations and asyn- 
chronies, the test stimuli have been 
printed letters (Cattell, 1885; Sperling,
1960a), geometric patterns
(Humphrey, Dawe, & Mandell, 1955; 
Kietzman, 1962), or contoured areas 
(Cheatham, 1952; Lindsley & Em- 
mons, 1958; von Noorden & Burian, 
1960). 


SUBLIMINAL INFLUENCES 


The perceptual blanking paradigm 
can be changed so that both stimulus 
flashes are informational in charac- 
ter. With appropriate adjustment of 
stimulus durations and asynchrony, 
the pattern flashed first is not seen. 
Several studies have shown, however, 
that the first (masked) stimulus can 
influence perception of the masking 
stimulus. Smith and Henriksson 
(1955) reported that the phenomenal 
appearance of a square can be dis- 
torted by prior presentation of a fan- 
shaped pattern of lines. Mispercep- 
tions were also obtained in a subsequent
experiment in which the stim-
uli were line drawings of real objects 
(Smith & Henriksson, 1956). Draw- 
ings of human faces have been re- 
ported to be differently perceived as 
they are preceded by the words 
"happy" or "angry" (Smith, 1957) or 
by representations of male or female 
genitalia (Klein, Spence, Holt, & 
Gourevitch, 1958). 

Similar procedures have been used 
to influence cognitive processes. 
Kolers (1957) has reported that pres- 
entation of the masked solution to a 
problem increased the probability of 
discovery of that solution. 

Goldiamond (1958) has analyzed 
studies which report the existence of 
subliminal influences on behavior. 
Such studies, he notes, are contami- 
nated by their use of two different 
response indicators—one to establish 
a limen, the other to demonstrate the 
subliminal effect. Comparison of the
indicators usually used shows them 
to be capable of generating precisely 
those “discrepant” (or “asynchro- 
nous") psychophysical results that 
constitute the subliminal effect. Gol- 
diamond's comparative analysis of 
indicators, it should be noted, applies 
with equal force to the case of a stim- 
ulus at (or below) its masked thresh- 
old. In a certain sense, there are 
subliminal effects; in no sense are 
they mysterious. 


DISCUSSION 


Backward masking has been 
studied in experiments which have 
utilized visual, auditory, and cuta- 
neous stimulation. These studies 
have shown that perception of a tar- 
get pulse is interfered with if addi- 
tional (masking) stimulation closely 
follows the target. Immediately after 
its cessation, a brief stimulus is ex- 
tremely vulnerable to masking by a 
second stimulus. This vulnerability
disappears entirely after an interval that is
of the order of .1 sec.

If these findings are considered together
with similar results from
studies of forward masking, one can
conclude that unhindered perception 
of a target requires that it be "surrounded
in time" (i.e., preceded and
followed) by zones devoid of masking
energy. Backward masking and
forward masking are thus seen to in- 
volve extensions into the time do- 
main of figure-ground relations that 
are usually considered to be spatial in 
character. 


These notions can be cast in 
another form by referring to the 
source of the temporal inhibition. 
Thus, a (masking) stimulus, which is 
itself perceived, is temporally sur- 
rounded by regions of inhibition. 
Békésy (1959, 1960) has described 
the spatial "refractory area" which
surrounds the area of sensation produced
by a stimulus.

Since masked and masking ener- 
gies must be delivered within about 
.1 sec. of each other for backward in- 
hibition to occur, the target and 
masking stimuli are usually pulses or 
bursts. Important in the perception 
of such brief stimuli are the neural 
responses associated with stimulus 
onsets (Boynton, 1961). Indeed, the 
Broca-Sulzer effect, which concerns 
the brightness of single brief flashes, 
has been related to changes in the 
average frequency of optic impulses 
as a function of flash duration 
(Adrian, 1928; Katz, 1959). It is 
puzzling, however, to find no similar 
auditory effects in the growth of 
loudness with stimulus duration 
(Miller, 1948; Small, Brandt, & Cox, 
1962). 

Although the experimental results 
of two-pulse studies of backward 
masking are in general agreement
with each other, the task remains of
identifying the neural mechanisms 
mediating retroactive masking of one 
stimulus by another. The problem is 
complicated by the fact that there 
are both peripheral and central con- 
tributions to the masking phenome- 
non. Although Ratliff, Hartline, and 
Miller (1962) have recently described 
some of the temporal parameters of
lateral inhibition in the Limulus 
eye, direct measurements of the neu- 
ral events accompanying backward 
masking are lacking. As a result, 
most of our notions about neural 
mediating mechanisms are inferences 
from purely psychophysical studies. 
Piéron (1925) and Crawford (1947) 
proposed that, with the masking 
stimulus more intense than the probe, 
differences in neural latency could 
offset asynchrony of stimulation (Δt)
and provide for some kind of inhibi- 
tion of the responses evoked by the 
test pulse. This inhibition was con- 
sidered by Cheatham (1952) to result 
from cortical "satiation" (see Köhler
& Wallach, 1944). 

That the inhibition is not complete 
—at least in the case of metacontrast 
—is shown by the results of RT 
measurements (Fehrer & Biederman, 
1962; Fehrer & Raab, 1962). The 
masked test stimulus may not be 
seen as such (or at all!), but it can 
trigger responses of normal latency. 
The studies of subliminal influences 
(reviewed in the preceding section) 
suggest, furthermore, that the com- 
bination of test and masking stimuli 
is differently perceived from the 
masking stimulus alone. The neural 
representation of the test stimulus, it 
would seem, is not totally occluded 
by the masking pulse. 

Perception of the probe stimulus 
as such, forced-choice discrimination 
of mask from mask-plus-probe, re- 
sponding (in a simple RT experiment)
to probe-plus-mask: these are
informationally different from each 
other. Presumably, the neural events 
required for these responses to occur 
are also different from each other. 
Until these neural events are directly 
observed, however, knowledge about 
backward masking will continue to 
be based on the results of psycho- 
physical studies. It seems worthwhile 
to broaden the scope of these studies 
by investigating “analogous” phe- 
nomena in several sense systems and 
by extending the range of stimulus 
variables and indicator responses. 
For the present, it seems very prema- 
ture to exclude from consideration 
purely "perceptual" accounts of 
backward masking (Battersby & 
Wagman, 1962, p. 364). Psycho- 
physical and neurophysiological re- 
searches are both needed.

A selective review of the literature suggests that the same major features
of behavior pathology appear in both the clinical studies of human dis- 
orders and experimentally produced behavior disorders in animals. In 
the clinical area the 3 major features are: presence of an intense anxiety 
reaction, development of stereotyped and repetitive symptoms, and 
fixation of needs and emotions at an immature level. Studies of experi- 
mental neurosis reveal that acute anxiety and formation of stereotyped, 
repetitive symptoms are characteristic of this area as well, and related 
studies point to fixation as a consequence of infantile frustration. It 
would appear that the same principles control behavior pathology in
several species and are applicable in a wide range of situations.


Investigation of behavior pathol- 
ogy has a long heritage, but perhaps 
two significant trends can be nomi- 
nated as having substantially in- 
creased our understanding of func- 
tional disorders since the turn of the 
century. On the one hand, the advent 
of psychoanalysis laid the foundation 
for a dynamic theory of human 
pathology, and in a broader sense it 
recast the traditional conceptions of 
personality into a more kinetic form. 
Emotion, conflict, and anxiety re- 
ceived increasing emphasis as basic 
operators in human behavior, and 
disturbances in emotional relation- 
ships became the focal point for 
interpreting behavior disorders. The
theory was fashioned from data 
secured in the therapeutic treatment 
room, but as child psychiatry and 
clinical psychology developed, the 
domain of data expanded markedly 
and broadened the empirical frame- 
work on which the theory rested. 

1 This paper has been substantially
strengthened by the comments and criticisms
of A. L. Benton and A. B. Heilbrun, and I
am privileged to record my indebtedness to
them.

The transition from an organic
view of pathology to a dynamic
functional view was of inestimable
significance for treatment and diagnosis.
Nevertheless, the concepts
introduced to account for patho- 
logical phenomena were not always 
clearly defined or open to verifica- 
tion; and some pointed criticisms 
were raised about the subjective bias 
of both patient and analyst that 
might enter into the interpretation of 
data. 

The second major trend is of some- 
what more recent vintage, dating 
from Pavlov's investigation of condi- 
tioned reflexes, and it can be referred 
to as the experimental investigation 
of behavior pathology. It is repre- 
sented by the work of Gantt and 
Liddell on experimental neurosis, by 
Masserman’s work on conflict, Maier 
on frustration, and the work of Solo- 
mon and his colleagues on traumatic 
avoidance learning. Although the 
conceptual schemata differ, these 
experiments share common properties 
in terms of the behavior disturbances 
produced, and within the limits of
design differences they generally cor- 
roborate one another. As will sub- 
sequently become evident, the major 
features of pathology revealed in 
these studies also correspond to
several important features of human 
pathology as seen in the clinic. Yet 
this body of research, which clearly 
satisfies the criteria of being objective
and controlled, has not been em-
braced by clinicians, mainly because 
its significance for human pathology 
is somewhat obscure; and the experi- 
mentalists themselves have made 
only nominal efforts to integrate 
their results with clinical data. 

The purpose of the present essay is 
to focus upon points of common 
agreement between the clinical and 
experimental areas, and to suggest 
that the same principles apply to 
both areas. To this end a selected 
group of clinical studies will be re- 
viewed in an effort to establish some 
basic parameters that cut across the 
traditional categories of behavior 
pathology. Subsequently the experi- 
mental literature will be examined for 
concepts and empirical laws that will 
anchor our clinical concepts more 
securely. The final section will touch 
briefly on the implications of this 
review for the analysis of behavior 
pathology. 


CLINICAL STUDIES OF HUMAN 
PATHOLOGY 


Kubie on Neurosis 


In an early paper Kubie (1941) 
analyzed the characteristics of neu- 
rotic behavior in search of a general 
principle that would unify the com- 
monly recognized symptoms. He 
proposed the principle of repetitive- 
ness, or more particularly, that dis- 
tortions in the normal process of 
repetitiveness constituted the core of 
neurotic behavior. Kubie argued 
that the organization of the central 
nervous system provides for sus- 
tained impulses through the opera- 
tion of open and closed circuits; con- 
sequently, the psychological develop- 
ment of the organism is rooted in 
repetition of experience. Motivated 
by diffuse tension, the infant responds 
with random efforts which gradually 
evolve into more economical forms
until they finally become specifically
goal directed. The acquisition of 
skills depends upon endless but flex- 
ible and spontaneous repetition of 
motor activity. 

While these skills are developed 
primarily to relieve states of tension, 
they soon acquire secondary meaning 
as rewarding activities in their own 
right. Functions such as walking, 
talking, manipulating objects, ex- 
ploring and mastering new situations 
are practiced time and again because 
the child is highly gratified by the 
exercise of these new functions. Great 
emotional significance attaches to 
these skills, either of delight and 
satisfaction in the case of uninter- 
rupted practice, or of tension and 
rage where such activities are inter- 
fered with. 

Kubie believes that frustration 
intervenes most markedly at this 
point, for the repetitive behavior of 
the child may stimulate inhibitory 
controls from his parents. When 
punishment or threats are applied to 
curtail such behavior, the child re- 
sponds with rage and temper tan- 
trums. If the parents vigorously 
suppress this outlet also, the stage is 
set for severe conflict. 

It is at this point that the shift 
from normal flexible repetitiveness to 
rigid neurotic repetitiveness takes 
place. Successive expressions of the 
need are no longer modified by reward 
or punishment, but are cast rigidly as 
the only possible compromise solu- 
tion of all the child feels in the con- 
flict situation. Consequently, the 
repetitive act becomes irresistible to 
the child, and it displays a rigid 
intensity that eliminates flexible prob- 
lem solving behavior. 

In a later paper, Kubie (1954) sums
up the basic distinction between 
normal and neurotic behavior: 


[Normal] patterns of behavior, no matter how
varied they may be, will have one basic char-
acteristic in common, namely that any 
repetitiveness which that behavior may ex- 
hibit with respect to impulse, thought, action 
or feeling, or any combination of these, will be 
flexible, modifiable, satiable.... [Neurotic 
behavior] will have precisely opposite char- 
acteristics; it will be repetitive, obligatory, in- 
satiable, and stereotyped (pp. 202-203). 


Alexander and French (1946) are
also persuaded that repetitive behavior
is a prominent feature of
neurosis. Drawing on extensive
therapeutic contacts with neurotic 
patients, they summarize the basic 
problem as follows:

In normal development, patterns from the
past undergo progressive modification. One 
learns from experience by correcting earlier 
patterns in the light of later events. When a 
problem becomes too disturbing to face, how- 
ever, this learning process is interrupted and 
subsequent attempts to solve the problem 
must, therefore, assume the character of 
stereotyped repetitions of previous unsuc- 
cessful attempts to solve it. A neurosis may be 
defined as a series of such stereotyped reac- 
tions to problems that the patient has never 
solved in the past and is still unable to solve in 
the present (p. 76). 


Frustration and Schizophrenia 


In behavior disturbances more 
severe than neurosis, similar tenden- 
cies toward rigid repetition of certain 
acts, regardless of their consequences, 
have been noted. 

Jenkins (1950, 1952) has proposed 
that frustration carried beyond the 
tolerance level of the individual 
stimulates disorganization, with- 
drawal, and stereotypy. Drawing on 
Maier’s (1949) experimental work, 
Jenkins attributes the schizophrenic 
process to profound frustration that 
arises chiefly in the area of interper- 
sonal relationships. After repeated 
rebuffs, the schizophrenic gradually 
withdraws from emotional contact
with other people and dwells more
and more in the realm of fantasy. 
Efforts to establish rewarding relationships
are gradually replaced by
regressive, stereotyped responses that 
further aggravate the problem. 

Jenkins (1950) finds considerable 
support for his position in the pub- 
lished clinical literature, much of 
which assigns a prominent role to 
early frustrating experiences as a 
causal factor in schizophrenia. Stud- 
ies of the schizophrenic’s family 
often disclose an overpowering 
mother who is described by such 
adjectives as perfectionistic, dominat- 
ing, aggressive, overanxious and over- 
protective—the type of mother who 
markedly interferes with the child's 
growth toward independent selfhood. 
The point is illustrated by a sample 
of statements chosen more frequently 
by mothers of male schizophrenics 
(Mark, 1953). 

1. A mother should make it her 
business to know everything her 
children are thinking. 

2. Children should not annoy par- 
ents with their unimportant prob- 
lems. 

3. A watchful mother can keep her 
child out of all accidents. 

4. A devoted mother has no time 
for social life. 

5. Playing with a child too much 
will spoil him. 

6. A mother has to suffer much 
and say little. 

7. Children who take part in sex 
play become sex criminals when they 
grow up. 

8. Too much affection will make a 
child a “softee.”

Jenkins (1952) reasons that perva- 
sive control measures invade the day- 
to-day experience of the child 
throughout a wide range of situations 
and “make it more than usually diffi- 
cult for a child to maintain a sense of 
his individuality, except in the autis- 
tic withdrawal of fantasy” (p. 740). 
Frustration at this stage interferes 
with the development of effective
coping mechanisms and forces the 
child into regressive, stereotyped 
patterns of behavior. 

A more phenomenological analysis 
of schizophrenia is offered by Gun- 
trip (1952), who states that the pri- 
mary danger for psychological de- 
velopment lies in early object-rela- 
tionships that are frustrating for the 
child. When the mother is cross, 
impatient, and punitive with her 
child, or is emotionally detached and 
unresponsive, the child experiences 
such behavior as frustrating his most 
important needs. Consequently the
mother becomes a bad object, and
"an inner psychic world is set up ...
in which one is tied to bad objects
and feeling, therefore, always frus- 
trated, hungry, angry and guilty, and 
profoundly anxious" (p. 348). 

Guntrip argues that a bad object 
cannot simply be dismissed. The 
most primitive reaction to early 
deprivation is to become patho- 
logically attached to the object, and 
to continually rehearse these frus-
trating experiences in fantasy in an
effort to make them turn out posi- 
tively. They never do, though, and 
the schizophrenic remains fixated at a 
primitive level of emotional develop- 
ment, intensely preoccupied with 
problems of nurturance and support. 
He senses his own needs as being 
overwhelming and all-consuming,
capable of exhausting the resources of
anyone offering a supportive relation-
ship. By the same token, the schizo-
phrenic is acutely fearful of being
rejected or exploited by a poten- 
tially gratifying object. He is there- 
fore repetitively drawn into relation- 
ships offering support, but once es- 
tablished, the schizophrenic finds 
them too threatening to be main- 
tained. He is terrorized by the pros- 
pect of a relationship which he per-
his emotions are so poorly controlled
that he is in constant danger of being 
overwhelmed by tension. 

Set aside the mentalistic overtones 
of Guntrip's argument and it is evi- 
dent that he is pointing in the same 
direction as Kubie and Jenkins.
Frustration and deprivation in severe 
degree interfere with the normal 
course of development, and pathology 
is reflected in repetitive patterns of 
behavior and thought, in extreme 
tension levels, and in a freezing of 
emotional development, where needs 
of historical significance continue to 
plague the individual long beyond 
the stage they are appropriate. There 
is a loose consensus here about the 
important features of human pathol- 
ogy, a grouping that is amenable to 
further investigation. Since the fore- 
going studies have cast childhood 
frustration in a principal role, we
turn now to a series of reports on 
behavior pathology in children, to see 
if disturbances at an early age are 
expressed in the same way. 


BEHAVIOR PROBLEMS IN CHILDREN 


Erikson (1940, 1953) has observed 
that repetitive sequences in a child's 
play activities are often traceable to 
conflicts being expressed with the 
toys. He suggests that play serves 
for the anxious child the same func- 
tion as talking over problems or 
vicariously rehearsing them does for 
adults: it provides a limited sphere 
somewhat removed from the conflict 
situation in which the central fea- 
tures of the problem can be replayed, 
examined, and alternatives evalu- 
ated. But even here anxiety may 
intercede if the play activities too 
closely parallel the real life conflict. 
When the problem is of central sig- 
nificance to the child yet he cannot 
resolve it, play activities assume a 
repetitive, intense nature, inevitably
leading to some emotional dead end 
and consequent play disruption. The 
problem may be defensively adjusted 
through unconscious transformations 
to avoid outright recognition, but its 
repetitive expression in play testifies 
to the position of prominence it oc- 
cupies in the life of the child. 

More serious behavior disturbances 
in children have been investigated by 
Bettelheim (1950) and by Redl and 
Wineman (1951, 1952). Bettelheim's 
children are notable for their with- 
drawal, their autistic reconstruction 
of reality, for serious problems with 
such fundamental processes as eating, 
elimination, and sleep; and they 
exhibit a host of repetitive behaviors 
that are heavily colored with sym- 
bolic significance. The prelude to 
these difficulties is suggested by the 
social history of the children, where 
deprivation and rejection are recur- 
rent themes. 

Perhaps the most significant change 
during treatment is the gradual free- 
ing of needs and impulses that hereto- 
fore had been drastically inhibited. 
In the supportive atmosphere of the 
treatment center, where no restric- 
tions are placed on regressive be- 
havior, the child may experiment 
with indulging his primitive needs. 
If the problem touches on nourish- 
ment and security, as it does for most 
of these children, regressive behavior 
with food occupies a prominent role. 
Socialized eating habits are dispensed 
with in favor of manually stuffing the 
mouth full of food. Demands to be 
spoon fed or nursed from a bottle are 
not uncommon. 

Yet the reactivation of these needs 
brings about a serious anxiety reac- 
tion that the child cannot cope with. 
It is particularly disruptive because 
his history is notably deficient in ex-
periences wherein some behavior on
his part was successful in relieving 
tension. Having only a limited reper- 
toire of coping mechanisms, the child 
is overwhelmed by the strength of his 
impulses and he fears losing control 
of himself. As insurance he may insti- 
tute compulsive rituals to protect 
himself from anxiety. 

Behavior gradually becomes more 
flexible and reality-oriented as the 
child avails himself of unrestricted 
gratification. His emotional reactions 
are updated to conform to present 
circumstances, rather than being 
dominated by past experiences of 
frustration. He can enlarge his sphere
of interests and, more importantly, he
develops inner controls to initiate 
and modify behavior in adaptive 
fashion—a series of coping mecha- 
nisms to operate on the environment 
and regulate impulse expression. In- 
stead of being passively over- 
whelmed, the child now participates 
actively in growth experiences that 
promote a sense of confidence in his 
ability to manage his life. 

The emphasis here upon inner con- 
trols recalls the point made by Jen- 
kins and Kubie about the develop- 
ment of mastery skills—how these 
play a fundamental role in adjust- 
ment. Where circumstances combine 
to interfere with their growth, the 
child is seriously handicapped in his 
transactions with the environment 
and in the management of his im- 
pulses. The Pioneers of Redl and 
Wineman, to whom we turn next, 
also illustrate the point but with the 
unique twist of having a few mastery 
skills, or ego functions, overworked 
in the service of defense. 

The children chosen for treatment 
by Redl and Wineman (1951, 1952) 
were highly aggressive and destruc- 
tive, characterized by serious defi-
ciences in behavior control. These
delinquents were unable to handle 
reasonable amounts of tension with- 
out becoming disorganized. Fear, 
excitement, guilt, recall of past 
memories—even in minor doses these 
events sufficed to overwhelm the 
control system and stimulate violent 
acting out. The ego functions of ap- 
praisal, control, and delay were 
quickly swamped by unmanageable 
tension and the child's behavior ex- 
hibited regressive, stereotyped char-
acteristics.

But in sharp contrast to their help- 
lessness in coping with internal ten- 
sion, the Pioneers exhibited a set of 
shrewdly developed defenses that 
protected their gratification outlets 
and insulated them from the implica- 
tions of their behavior. It is exempli- 
fied by the delinquent's attempt to 
provoke restrictive, punitive action 
from adults, thereby justifying his 
belief that he is persecuted and is en- 
titled to express his hatred and ag- 
gression against the persecutors. Dis- 
trust of adults is strongly rooted in 
early experiences of frustration and 
rejection, and techniques to close off 
interference from that quarter, to 
minimize potential danger to im- 
pulse expression, have been sharp- 
ened through a long history of war- 
fare with a hostile environment. Con- 
currently, self-protective mechanisms 
develop as an armor against recog- 
nizing personal responsibility for the 
behavior in question. Thus fortified, 
the delinquent shrewdly gears his 
behavior to maintain free license and 
justify his delusional belief that all 
adults are out to get him. But the 
defensive nature of these activities is 
disclosed by their rigid repetitiveness 
even in the benign atmosphere of the 
treatment home, and by the appear- 
ance of regressive demands for grati-
fication once a positive relationship
with an adult is established. 


Natural Experiment in Adult Frustration


The discussion thus far has em- 
phasized the effects of frustration 
during early development. However, 
the same mechanism is conceived to 
be operative under conditions of 
stress at all stages of maturity, al-
though during adulthood the effects
on well established behavior patterns 
may be less marked. In particular, 
stereotypy of behavior is expected as 
an outgrowth of extreme tension, as 
well as a progressive breakdown in 
the more highly refined behavior con- 
trols. 

Hinkle and Wolff (1956) have im- 
pressively documented this process in 
their study of Communist indoctri- 
nation techniques. Analyzing the 
prisoner's experience in the hands of 
the Communist police, Hinkle and 
Wolff emphasize that he is confronted 
with a continuous series of frustra- 
tions. They compare the indoctrina- 
tion procedure with experimental 
studies of frustration and observe 
that the reaction of the prisoner is 
basically similar to that of the experi- 
mental subject, with the exception 
that the prisoner’s reaction is more 
all-embracing and devastating. The 
sequence of behavior following im- 
prisonment runs as follows: purpose- 
ful exploratory activity; random ex- 
ploration, with a general increase in 
motor activity; excitement, anxiety, 
hyperactivity; gradual subsidence of 
activity, with the exception of iso- 
lated repetitive acts. Such acts are 
endlessly repeated although they can 
never provide a solution. If pressure 
is continued long enough, the ulti- 
mate response is one of total inactiv- 
ity, accompanied by strong feelings 
of dejection. The prisoner is un-
usually receptive to approval or
human support (adapted from Hin- 
kle & Wolff, 1956, p. 160). 

The prison situation is unique in
the degree to which it interferes with 
the biological and social routines of 
the prisoner's life. In addition, the 
prisoner is subjected to repeated in- 
terrogations that play upon his emo- 
tional weak points and constantly 
pressure him to compromise his posi- 
tion on an issue he may not clearly 
understand. Effective use is made of 
stress, although seldom in the form of 
outright torture, until the prisoner's 
resistance eventually decays. Be- 
havior becomes more primitive and 
psychological withdrawal accom- 
panies the development of stereo-
typed responses. The entire process
may be understood as a reaction to 
acute and unremitting tension. 


Common Parameters 


This brief survey of clinical re- 
search discloses three significant fea- 
tures that seem to cut across all 
classes of behavior pathology. There 
is on the one hand a rigid, intense 
manner of expressing symptomatic 
behavior, no matter what the content 
may be. Symptomatic behavior may 
be understood as a compromise 
activity that has been crystallized by 
its success in relieving tension, al- 
though it is demonstrably ineffective 
in securing need satisfaction. As an 
activity it pursues a stereotyped 
course and is relatively indifferent to 
control through reward or punish- 
ment. In form and function the 
symptoms may mirror a behavior 
pattern of historical significance, now 
no longer appropriate; they may ex- 
press a conflict symbolically, or they 
may include postural and motor ad- 
justments that bear no discernible 
relationship to the problem at hand. 

In the second place, needs and emo-
tions operative at the time of frustra-
tion seem to be fixated, and they fur- 
nish the individual with residual ten- 
sions that are chronologically out of 
step with his development in other 
areas. It is obvious that much sur- 
plus and subjective meaning attaches 
to the terms “need” and “emotion”;
nevertheless, they are roughly de- 
scriptive of an internal state of 
affairs that exercises pressing control 
on the individual's behavior. The 
adult who is described as an oral 
character still maintains certain in- 
terests appropriate to an earlier phase 
of development, and the gratifications 
he seeks are thinly disguised hold- 
overs from this period. The steady 
progression to mature, differentiated 
forms of emotional expression and 
impulse control is interrupted, and 
old problems of historical significance 
continue to bias all contemporary 
relationships. The individual cannot 
escape the past, and his techniques 
for coping with the environment like- 
wise remain at a primitive, immature 
level. 

The third important feature of be- 
havior pathology is the presence of 
an intense anxiety reaction, and the 
manifold changes in behavior pro- 
duced by anxiety. Due to its com- 
pelling drive properties, anxiety 
forces the individual into response 
patterns that ward off or alleviate 
anxiety, regardless of their adaptive 
value for other purposes. Precision 
and control give way to disorganiza- 
tion and panic. Flexible, goal di- 
rected adjustments are disrupted 
and behavior is crystallized into a 
stereotyped pattern. Subjectively, 
an anxiety reaction is accompanied by 
feelings of overwhelming dread and
terror that are unpleasant in the ex- 
treme for the individual. His be- 
havior is then dominated by primi-
tive attempts to terminate the anx-
iety reaction and to ward off future
attacks at all costs. These attempts 
will generally include internal de- 
fensive operations that process 
threatening thoughts or memories 
out of awareness, as well as the forma- 
tion of stereotyped symptoms that 
forestall anxiety. Where anxiety is 
severe enough or is chronically sus- 
tained, it forces drastic changes in 
behavior of a pathological nature. 
From this viewpoint, anxiety appears 
to be the common denominator that 
underwrites the major features of 
behavior pathology. 

These clinical phenomena serve as 
a basic point of reference for inter-
preting behavior pathology. But we
are hindered at this stage by a cer- 
tain looseness of terminology and 
concept, a vagueness about the exact 
nature and operation of these phe- 
nomena. The experimental literature 
on behavior pathology has made 
substantial inroads in this direction, 
and a survey of some of these studies 
may help to clarify the points in 
question. 


EXPERIMENTAL INVESTIGATIONS
OF BEHAVIOR PATHOLOGY


The transition from the clinic to 
the laboratory reveals some abrupt 
changes in design and procedure, as 
well as a shift from human to animal 
subjects. We shall be principally 
concerned with three overlapping 
categories of research that offer 
powerful concepts for interpreting 
human pathology. They are: frustra-
tion, traumatic avoidance learning,
and experimental neurosis. While the 
procedures differ markedly in each 
case, the results uniformly reveal
serious disturbances in behavior. A
brief review of these studies may
serve to illustrate the basic conditions 
that give rise to behavior pathology. 


Frustration and Response Fixation


The most extensive work on frus- 
tration has been carried out by 
Maier, whose theory and experiments 
were reported originally in his 1949 
monograph, and the theory has been 
extended in a more recent publica- 
tion (Maier, 1956). The basic appara- 
tus Maier uses is the Lashley jump- 
ing stand and the problem on which 
the animal is trained is a discrimina- 
tion between two visual patterns. 
Once a discrimination is established, 
frustration is introduced by locking 
the two doors randomly, so that 
neither a position response nor a dis- 
criminated pattern response leads to 
reward more than 50% of the time. 
Maier's definition of frustration 
flows directly from this procedure:
forcing the animal by means of an air 
blast or electric shock to respond to a 
presently insoluble problem. 
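
The arithmetic of the insoluble problem can be illustrated with a minimal simulation. The sketch below is not Maier's procedure in detail: it assumes that exactly one window is locked on each trial, that the locked side is chosen at random and independently of the cards, and that the formerly positive card appears on either side with equal probability. The function name and parameters are hypothetical. Under these assumptions neither a fixed position response nor a fixed pattern response should be rewarded much more or less than half the time.

```python
import random

def simulate_insoluble_problem(n_trials=10_000, seed=0):
    """Estimate reward rates for two fixed strategies under random locking."""
    rng = random.Random(seed)
    position_hits = 0  # strategy 1: always jump to the left window
    pattern_hits = 0   # strategy 2: always jump to the formerly positive card
    for _ in range(n_trials):
        # Assumption: the positive card is on the left or right with equal probability.
        positive_side = rng.choice(("left", "right"))
        # Assumption: exactly one window is locked, chosen independently of the cards.
        locked_side = rng.choice(("left", "right"))
        if locked_side != "left":
            position_hits += 1
        if locked_side != positive_side:
            pattern_hits += 1
    return position_hits / n_trials, pattern_hits / n_trials

position_rate, pattern_rate = simulate_insoluble_problem()
print(f"position response rewarded: {position_rate:.2f}")
print(f"pattern response rewarded:  {pattern_rate:.2f}")
```

Because the lock is independent of both side and card, no consistent rule of either kind can raise the reward rate, which is the sense in which the problem is insoluble.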

Under these conditions, the jump 
latency increases and the animal may 
interpolate several abortive jumps 
into his response pattern. The ten- 
sion under which the animal operates 
is reflected by the number of seizures 
experienced on the jumping platform. 
One response, usually a position re- 
sponse, becomes increasingly stereo- 
typed and is routinely performed on 
each trial. As the stereotyped re- 
sponse is established, less resistance 
to jumping is manifest and seizures 
decrease; apparently it provides an 
outlet for tension. Once established, 
the response continues indefinitely. 
Even when the problem is made 
soluble again the animal does not 
break the pattern, although he may 
give evidence of recognizing what the 
correct response is. Short of some 
specialized therapeutic measures, the 
animal's behavior is remarkably in- 
variant. 

Maier (1956) has introduced the 
concept of frustration threshold to 
handle these data, suggesting that
extreme frustration precipitates a 
sharp transition to massive and 
uniquely patterned autonomic reac- 
tions that override voluntary control. 
Maier reasons that tension pitched at 
a very high level may remove cortical 
inhibition of primitive neural mech- 
anisms and facilitate gross emotional 
discharge in the form of seizures, tan- 
trums, and rage. Tension functionally 
reverses the processes of individua- 
tion and specificity in neural control 
and pushes behavior towards more 
primitive forms. While Maier has not 
yet clearly coordinated these super- 
threshold tensions with the behav- 
ioral characteristics of frustration— 
response stereotypy, abnormal fixa- 
tions—the evidence is strongly in 
favor of some mechanism by which 
internal tension transforms normally 
variable, goal oriented behavior into 
an immutable response pattern. 


Traumatic Avoidance Learning 


We turn now to a series of studies 
that have focused explicitly upon the 
behavior changes effected under acute 
pain-fear conditions. Using electric 
shock of just subtetanizing intensity, 
Solomon and his colleagues (e.g., 
Solomon & Wynne, 1953, 1954) have 
traced the course of avoidance learn- 
ing and explored the physiological 
correlates of massive pain-fear reac- 
tions mobilized by shock. 

The apparatus is a shuttle box 
with a gridded floor, separated into 
two compartments by an adjustable 
barrier and a drop gate. The dog is 
placed in one compartment, the con- 
ditioned stimulus (CS) is presented, 
the drop gate removed, and 10 sec- 
onds later shock is administered. 
After a period of intense panic activ- 
ity, the dog scrambles over the bar- 
rier and by so doing terminates both 
shock and the CS. 

The basic datum is the latency of
the animal’s jump over the barrier, 
measured from the onset of the CS. 
The first few trials are escape trials, 
the animal failing to jump until 
shocked, but by the fifth trial the 
average dog has executed an avoid- 
ance response within the 10-second 
interval and therefore is not shocked. 
By definition the animal is now in 
the extinction phase, and the experi- 
ment is continued indefinitely to as- 
sess resistance to extinction. 
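
The distinction between escape and avoidance trials follows mechanically from the latency measure and the 10-second interval between CS onset and shock. The sketch below shows one way the scoring could be expressed. The function names are hypothetical, the latencies are invented for illustration and are not data from the studies, and treating a jump at exactly 10 seconds as an escape trial is an assumption.

```python
CS_SHOCK_INTERVAL = 10.0  # seconds from CS onset to shock, as in the procedure described

def classify_trial(latency_s, interval_s=CS_SHOCK_INTERVAL):
    """Label a trial by whether the jump preceded the scheduled shock."""
    return "avoidance" if latency_s < interval_s else "escape"

def first_avoidance_trial(latencies, interval_s=CS_SHOCK_INTERVAL):
    """Return the 1-based number of the first avoidance trial, or None if there is none."""
    for trial, latency in enumerate(latencies, start=1):
        if classify_trial(latency, interval_s) == "avoidance":
            return trial
    return None

# Hypothetical latencies (seconds) for one animal; illustrative values only.
latencies = [14.2, 12.5, 11.8, 10.9, 7.3, 4.1, 2.6, 1.9, 1.6]
print([classify_trial(x) for x in latencies])
print(first_avoidance_trial(latencies))
```

On this scoring, every trial after the first avoidance response on which the animal jumps within the interval is, by definition, an unreinforced (extinction) trial.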

These animals manifest an abrupt 
shift from escape to avoidance re- 
sponses, and of greater significance, 
the jump latencies gradually decrease 
while the animal is executing success- 
ful avoidance responses. As trials 
cumulate, the animal jumps with in- 
creasing rapidity until a stable la- 
tency of about 1.6 seconds is reached. 
It should be emphasized that laten- 
cies stabilize long after the last shock 
is received, i.e., during the extinction 
phase. Solomon and Wynne (1954) 
conclude that fear has replaced shock 
as the drive, and escape from the 
fear producing CS serves to strength- 
en and move forward the jumping 
response. 

The persistence of the jumping re- 
sponse is remarkable. Animals car- 
ried through 600 or more extinction 
trials showed no sign of extinction. 
But during this period when the 
avoidance response is being precisely 
executed each time, the overt signs 
of anxiety rapidly disappear as the 
dog becomes more and more stereo-
typed in his jumping activities. A
rather casual attitude replaces the
acute panic reaction manifested ear- 
lier. If, however, the dog is forcibly 
prevented from jumping by means of
a barrier, an intense overt anxiety 
reaction develops immediately (Solo- 
mon, Kamin, & Wynne, 1953). 

Solomon and Wynne (1954) ad-
vance a carefully reasoned argument
to account for their results. Their 
argument is derived from two-proc- 
ess learning theory, but with two 
additional principles: anxiety con- 
servation, and partial irreversibility 
of high intensity pain-fear reactions. 
These additions form a conceptual 
base from which protracted resist- 
ance to extinction and the apparent 
loss of overt anxiety can be derived. 
The theory offers a major inroad to 
problems of human pathology and 
will be briefly outlined here. 

Anxiety conservation. This princi- 
ple grows out of observations that 
animals appear more relaxed as the 
response latency decreases to some 
stable value around 1.5 seconds.
Moreover, if an animal delays appre- 
ciably on one trial before jumping, he 
appears quite upset following the 
jump and responds very rapidly for 
the next few trials. On the strength 
of these observations, Solomon and 
Wynne suggest that the animal 
gradually establishes a stable re- 
sponse latency which is short enough 
to prevent full arousal of the anxiety 
reaction. When the CS is presented, 
a finite time lag intercedes before all 
components of the anxiety reaction 
are mobilized, and by virtue of a 
speedy instrumental response the 
animal terminates the CS and thus 
prevents full arousal of the anxiety 
reaction. Solomon and Wynne (1954)
carefully evaluate the literature on 
latency of autonomic functioning 
and conclude that at least 2 seconds, 
perhaps longer, must elapse before 
feedback from the peripheral auto- 
nomic nervous system can appreci- 
ably affect central motor processes. 
Moreover, variations in latency exist 
for the several autonomic responses 
that constitute the anxiety reaction. 
Consequently, the intensity and 
scope of the anxiety reaction acti-
vated by the CS is a direct function
of the exposure period and many ele- 
ments of the anxiety reaction are not 
aroused. 

Based on these considerations, the 
substance of the anxiety conserva- 
tion principle is: 
if nonreinforced exercise of a CS-CR relation- 
ship is the necessary condition for extinction, 
then the extinction of the associational link- 
age and at least this [the unaroused] portion 
of the anxiety reaction cannot take place. In 
one sense, the amplitude of the anxiety reac- 
tion is being conserved as a relatively intact 
~— a latent functional entity (p. 


Put another way, the animal does not 
test reality by remaining with the CS 
long enough to find that it is no 
longer followed by shock. The instru- 
mental response, established under 
extraordinary levels of pain-fear, now 
is sustained by its efficacy in prevent- 
ing full scale arousal of anxiety. So 
long as the animal can perform the 
avoidance response rapidly, he can 
control anxiety; he has, at the be- 
havioral level, the equivalent of a 
defense mechanism. But the very act 
that prevents anxiety also eliminates 
the conditions that must obtain for 
extinction to occur, namely, repeated 
arousal of anxiety within the CS 
situation but with the original rein- 
forcement absent. So anxiety con- 
tinues as a latent but nonetheless 
potent state, supporting all manner 
of avoidance activities in a situation 
that has long since ceased to have its 
former significance. 
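
The logic of anxiety conservation can be made concrete with a toy model. The sketch below is only an illustration of the reasoning, not a model proposed by Solomon and Wynne. It assumes that the anxiety reaction consists of separate components, each with its own onset latency, and that a component loses a fixed fraction of its strength on an unreinforced trial only if it was aroused before the jump terminated the CS. The component names, onset latencies, and decay rate are all hypothetical.

```python
def run_extinction(onsets, jump_latency, n_trials, decay=0.05):
    """Toy model: only components aroused before the jump lose strength."""
    strengths = {name: 1.0 for name in onsets}
    for _ in range(n_trials):
        for name, onset in onsets.items():
            # A component is exercised without reinforcement only if the CS
            # lasted long enough to arouse it on this trial.
            if onset <= jump_latency:
                strengths[name] *= (1.0 - decay)
    return strengths

# Hypothetical components and onset latencies in seconds (not from the source).
onsets = {"startle": 0.5, "heart_rate": 2.0, "respiration": 3.0, "visceral": 5.0}

fast = run_extinction(onsets, jump_latency=1.6, n_trials=200)  # rapid avoidance
slow = run_extinction(onsets, jump_latency=9.0, n_trials=200)  # prolonged exposure
print(fast)
print(slow)
```

With a jump latency of 1.6 seconds, every component whose onset is 2 seconds or later retains its full strength after 200 unreinforced trials, whereas prolonged exposure to the CS lets all components decline. This is the sense in which the rapid response conserves anxiety against extinction.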

This treatment is roughly analo- 
gous to the clinical interpretation of 
defense mechanisms. When some 
event has been associated with se- 
vere anxiety, defense mechanisms are 
instituted to prevent subsequent 
arousal of anxiety. Defense mecha- 
nisms usually process out internal 
stimuli (thoughts, impulses), but 
they may also impose selective dis-
tortions upon the perception of ex-
ternal events that are threatening to 
the individual. Anxiety is thus the 
motivator of defense mechanisms, 
and at the same time it is the emo- 
tionally distressed state the indi- 
vidual avoids by dint of his defenses. 
By eliminating or disguising the in- 
ternal stimuli that have become a 
signal for anxiety, the defense mech- 
anisms successfully prevent an anx- 
iety attack, just as withdrawal from 
the CS eliminated signs of anxiety in 
Solomon's dogs. 

Partial irreversibility of intense 
pain-fear reactions. While anxiety 
may be conserved against extinction 
by a rapidly performed avoidance 
response, there are instances where 
the instrumental response either is 
not or cannot be executed quickly. A 
barrier may prevent jumping alto-
gether, or on one trial the jump
latency may rise above its usual stable
value. On such occasions more
anxiety should be aroused, and in the 
absence of pain as the unconditioned 
stimulus (UCS) the anxiety reaction 
should be fractionally reduced. Theo- 
retically, a slow and gradual loss in 
the anxiety reaction would be ex- 
pected. While extinction may be 
extended by the principle of anxiety 
conservation, it should not be perma- 
nently postponed. 

On the strength of their data and 
related literature on avoidance learn- 
ing, Solomon and Wynne (1954) be- 
lieve that it is empirically possible to 
produce avoidance responses that 
will last for thousands of trials. They 
believe that ordinary extinction pro- 
cedures will be ineffective for cases of 
severe trauma; anxiety will never be 
completely eliminated. They con-
clude, “Therefore, there must be a
point at which the anxiety conserva- 
tion phase is buttressed in some way; 
there must be some reason for such 
resistance to extinction...” (p.
361). 

The second principle, partial irre- 
versibility, constitutes the reason, and 
it means simply that when an intense 
pain-fear reaction of wide autonomic 
scope is classically conditioned to a 
CS, the stimulus is permanently in- 
vested with power to evoke a residual 
anxiety reaction. Repeated extinc- 
tion trials may depress the anxiety 
reaction, but there is a fixed thresh- 
old value beyond which normal ex- 
tinction procedures have no effect. 
Solomon thinks of partial irreversibil- 
ity as a neurophysiological phenome- 
non, reflecting a relatively permanent 
reorganization within the central 
nervous system. The change is as-
sumed to represent a decreased thresh-
old of sensitivity, analogous to the
partial reorganization of hormonal 
functioning that Selye (1950) incor- 
porates in his concept of the adapta- 
tion syndrome. 

With these two principles, Solomon 
and Wynne are able to interpret be- 
havior that is functionally impervi- 
ous to extinction. Clinicians have 
long since suspected that maladap- 
tive behavior must be controlled by 
some such principles because it per- 
sists paradoxically even though caus- 
ing distress and punishment. Of 
prime significance here is Solomon's 
observation that punishment may
actually strengthen rather than 
weaken the instrumental avoidance 
response. Once the jumping response 
is firmly established, shocking the 
animal for performing the response 
seems to increase anxiety more than 
it inhibits jumping. This gives rise to 
the peculiar spectacle of an animal 
squealing vigorously as the CS is 
presented, yet inexorably jumping 
into shock. If our earlier equation of 
avoidance responses with human de- 
fensive activities is valid, it becomes 
increasingly clear why punishment 
does not eradicate anxiety-motivated
behavior in clinical patients. 


EXPERIMENTAL NEUROSIS


Behavior disturbances in animals 
have occupied a prominent research 
niche ever since Pavlov produced a 
“neurosis” in dogs who could no 
longer discriminate between positive 
and negative conditioned stimuli. 
Gantt (1944, 1953) and Liddell (1944, 
1953) are among the principal Ameri- 
can investigators using the condi- 
tioned reflex technique to study be- 
havior disturbances, and their results 
are of considerable theoretical signifi- 
cance for the problem of anxiety. 


Liddell on the Vigilance Reaction 


Liddell has experimented exten- 
sively with sheep, goats, and pigs, 
using a feeble electric current applied 
to the foreleg to condition leg flexion. 
A metronome beat is introduced as 
the CS, and after a number of pair- 
ings the CR is firmly established. 
Subsequently, a second metronome 
beat is introduced but this beat is 
never followed by the UCS. After 
repeated trials a clear discrimination 
is established and the animal does not 
flex the leg to the negative stimulus. 
The animal does exhibit a sharp alert- 
ing reaction, just as for the positive 
stimulus, but in the negative instance 
he remains tense and vigilant al- 
though no response is performed. 
Paradoxically, the mild current ap- 
plied following the positive stimulus 
produces relaxation and an abrupt 
decrease in tension. 

By steadily converging the two 
metronome beats, the animal is re- 
quired to make finer and finer dis- 
criminations until the threshold is 
passed. At this point the animal re- 
sponds erratically, former discrimina- 
tions are lost, and behavior distur- 
bances appear. The animal may 
attack the apparatus, he may exhibit 


continuous tantrum behavior, or he 
may become cataleptic. Liddell has 
used several other procedures which 
are covered in detail in his 1944 
article, but in all instances his conclu- 
sions are basically the same. 

As a prelude to Liddell's basic the- 
sis, we might note that the feeble 
electric current used here is in dis- 
tinct contrast to the shock applied 
in traumatic avoidance learning. Lid- 
dell emphasizes that it is a startle 
stimulus rather than a pain stimulus; 
the current is set to be barely percep- 
tible on the moistened fingers of the 
experimenter. Consequently, disruptions
in behavior must be referred
to the preliminary training proce- 
dures and the internal tension level of 
the animal, not to the traumatizing 
nature of the external stimulus. 

Liddell (1953) proposes that the 
vigilance reaction is the emotional 
foundation out of which experimental 
neurosis develops. He documents his 
thesis by observing that a primordial 
function of the nervous system is 
vigilance, watchfulness, and general- 
ized suspiciousness. This primitive 
sentinel activity is a behavioral 
equivalent of Cannon’s emergency 
reaction. It is graded in intensity and 
reveals itself in qualitatively diverse 
behaviors, ranging from a startle re- 
action to panic. The vigilance reac- 
tion constitutes an emotional sub- 
strate for behavior, and when raised 
to disabling intensity it will disrupt 
prior habits and the flexible adjust- 
ments needed to insure adaptive be- 
havior. 

Conditioned reflex techniques lay 
the first stone by introducing the 
animal to an unfamiliar situation in 
which he is restrained by straps and 
has portions of the apparatus at- 
tached to his body. A long period of 
training is required for the animal to 
submit docilely to the conditioning 
regimen, during which impulsive behavior
is gradually subordinated to
habits of remaining alert yet self-contained
and quiet while employed
in the apparatus. Such restraints 
inevitably create tension in the ani- 
mal, revealed by periodic outbursts 
of tantrum behavior. Measures of 
respiration, heart rate, and gastroin- 
testinal activity similarly testify to 
internal arousal at the time the ani- 
mal appears to be quietly responding 
to the CS. If training is continued 
long enough, emotional arousal 
reaches a disabling intensity and dis- 
rupts organic processes as well as 
overt behavior. 

Whatever the neurological basis, 
behavior disturbances produced by 
this method seem to be facilitated by 
an absence of patterned motor activ- 
ity which would relieve the aroused 
state of the animal. One might argue 
that the animal's spontaneous be- 
havior, consisting mainly of efforts to 
escape, is gradually inhibited because 
it is ineffective in securing release 
from the confines of the Pavlov
frame. But overt habituation does 
not signify a decline in emotional 
arousal. Self-restraint is maintained 
at some expense, and even a 
thoroughly trained animal is easily 
disturbed by events that increase 
arousal or otherwise depart from the 
normal training schedule. It appears 
that the animal can inhibit only a 
limited amount of tension before 
more primitive mechanisms in the 
nervous system effect a gross dis- 
charge. As Judson Herrick observed, 
the mammal is constructed to be 
active and cannot tolerate restric- 
tions in this sphere indefinitely with- 
out pathological consequences. 


Gantt on Experimental Neurosis 


For a period of some 12 years, 
Gantt (1944) intensively studied the 
behavior disturbances produced in
one dog, Nick, by conditioned reflex techniques.
Basically, Gantt used a procedure
identical to that of Liddell except
that salivation was conditioned
rather than leg flexion. After the 
neurosis was established, Gantt made 
extensive autonomic recordings and 
systematically altered features of the 
conditioning situation, always ob- 
serving whether the animal's symp- 
toms improved or degenerated. His 
conclusions have been validated with 
numerous other subjects, but Nick 
serves as the best focal point for dis- 
cussion. 

Once a discrimination between 
positive and negative conditioned 
stimuli had been established, Gantt 
gradually converged the two stimuli 
and forced the limits of discrimina- 
tion. Under these circumstances a 
widescale emotional reaction devel- 
oped that might be termed anxiety. 
The animal was extremely upset during
the conditioning session and
actively resisted being placed in the 
apparatus. Autonomic changes ap- 
peared that surpassed in intensity 
the effects produced by such natural 
trauma as fighting, attack by another 
animal, or painful insult to the body. 
Respiratory difficulties, cardiac ac- 
celeration, increased blood sugar, and 
chronic pollakiuria are representative 
of the changes at this level. Coinci- 
dentally, these reactions became 
keyed to the CS; whenever the stimu- 
lus was presented a widescale and 
abrupt acceleration in autonomic 
processes immediately followed. And 
these dysfunctions were intractable; 
once elaborated in the form of a wide- 
spread anxiety reaction, they per- 
sisted erratically long after the more 
obvious signs of pathology had dis- 
appeared. At a more molar level, the 
behavior of the animal fell into a 
stereotyped format of overt symp- 
toms. Gantt refers repeatedly to the 
"marked character, regular manifestation
and stereotypy of pattern of
the symptom complexes."

It is a matter of some consequence 
to understand how a situation that 
does not include pain can produce 
such an intense, chronic level of 
anxiety. Disturbed behavior is in- 
tuitively reasonable when consider- 
able amounts of punishment have 
been absorbed; but what is responsi- 
ble for breakdown in the artificial 
world of reflex conditioning, where 
the most innocuous of stimuli are 
used? 

Gantt and Liddell agree that con- 
ditioning in the Pavlov apparatus is 
essentially emotional in nature, and 
the nominal CR is but an incidental 
feature that best serves as an index 
of the underlying emotional state. 
Stable responses testify to reasonable 
stability and integration in the ani- 
mal's emotional reaction; unstable 
and fluctuating CRs are an indicator 
of widespread autonomic and be- 
havioral disruptions that may de- 
velop precipitately. Even in the tra- 
ditional salivary conditioning experi- 
ments where no instance of behavior 
disturbance is reported, the emo- 
tional undertone is clearly revealed 
during extinction. When meat pow- 
der is omitted, the animal exhibits 
increasing agitation following the CS, 
and at a later stage he may become 
extremely upset and attack the ap- 
paratus. While the salivary response 
drops out under these circumstances, 
a simple report of the number of un- 
reinforced trials to extinction hardly 
does justice to the complex features 
of the animal's behavior.²


² In quite a different context, O. R. Lindsley
(1956) has applied operant conditioning tech- 
niques to a psychotic population and he re- 
ports that chronic schizophrenics often urinate 
or defecate during the extinction phase, again 
suggesting a strong emotional involvement. 


Experimental neurosis capitalizes 
upon this emotional substrate of be- 
havior; in Liddell's terms, upon the 
innate vigilance reaction the animal 
brings to the conditioning situation.
It plays upon processes endogenous 
to the organism; processes, in fact, 
that are at the heart of adaptation 
and survival. But in this instance 
the emotional processes are not keyed 
to the contingencies of the environ- 
ment. They are aroused in situations 
of no biological significance to the 
animal, and they cumulate because 
spontaneous escape activity is inhibited.
So they pervert their normal
function, contributing to disintegra- 
tion rather than adaptation. 


An Overview 


With all their diversity of emphasis 
and procedure, the experimental 
studies nevertheless are tied together 
by certain recurrent themes. Taking 
the studies as a group, two significant 
features stand out in all cases of be- 
havior pathology. On the one hand, 
the foundation for pathology is laid 
by a progressive state of emotional 
arousal that finally reaches disastrous 
proportions. Acute anxiety is the 
common denominator of these stud- 
ies, and at early stages it is expressed 
in autonomic fluctuations, in panic 
reactions, and in behavioral disor- 
ganization. Whether initiated by 
traumatically painful episodes or 
elaborated out of the vigilance reac- 
tion, anxiety is the basic operator in 
behavior pathology. 

Secondly, the constant feature of 
the behavioral symptoms is their 
stereotypy and repetitiveness. Once 
established, the symptoms are re- 
markably intractable to control by 
external reward or punishment. They 
may qualify as instrumental avoid- 
ance responses or they may simply 
include primitive response patterns 
that were incidentally fixated, but in
either case the behavior is continued 
long after the task requirements have 
changed. 

These two characteristics of behavior
pathology, anchored as they
are in careful experimental work, 
furnish substantial corroboration for 
the two similar features noted earlier 
in the clinical literature. We seem to 
be dealing here with principles of 
sufficient generality and power to 
produce consistent results even when 
a wide variety of species and proce- 
dures are sampled. The experimental 
studies do not provide any evidence 
on the third feature of clinical pathol- 
ogy, namely, the fixation of emotions 
and needs during early stages of de- 
velopment, but they were not de- 
signed to obtain data of this sort. 
Perhaps a group of experiments de- 
signed for this purpose, such as those 
of Hunt and his colleagues (1941, 
1947), would produce the type of data 
desired. 

If this be true, if in fact a set of 
principles can be established that 
apply to pathology in various species, 
then it would seem that we are in a 
more powerful position to isolate the 
basic conditions that underwrite pa- 
thology. Through animal studies the 
conditions that aggravate the emo- 
tional state of the organism can be 
explored, and nonverbal methods of 
therapeutic treatment can be syste- 
matically examined. There are 
enough striking parallels in symptoms 
between man and other mammals to 
suggest that valuable insights might 
be derived from such studies, insights 
that could be transposed and bene- 
ficially applied at the human level. 

This is not to suggest that human 
pathology is devoid of any distinguishing
characteristics. Human
pathology has many unique features, 
to be sure, features that are inter- 
woven with the advanced mental 
processes available to man. One cannot
fail to be impressed with the florid
ideation and rich detail of schizo- 
phrenic thinking. But if disturbances 
in behavior are keyed principally to 
emotional conditioning, perhaps the 
cognitive processes serve chiefly to 
express the problem more complexly, 
to extend through language and idea- 
tion the range of relevant experiences 
that are associated with the patho- 
logical state. In this sense the pecu- 
liarly symbolic quality that enters 
into human disturbances may be con- 
sidered a secondary phenomenon, 
just as the ability to cast the problem 
in verbal terms and communicate it 
to a therapist is. They are adjuncts 
that testify to man's ability to sym- 
bolically represent his experience at 
several different levels. But the in- 
dispensable feature of pathology is the 
state of anxiety keyed to significant 
portions of the individual's experi- 
ence, not the special verbal or mental 
images through which the experience 
is elaborated.


SUMMARY 


Two separate realms of discourse 
have contributed heavily to current 
conceptions of behavior pathology. 
The clinical realm, influenced largely 
by the theories of Freud, has offered 
dynamic interpretations of human 
disorders that are couched in a frame- 
work of drive, conflict, and defense. 
Its opposite number, experimental 
psychopathology, has been mainly 
occupied with animal studies in which 
behavior disturbances are methodi- 
cally produced under carefully con- 
trolled conditions. There has been a 
noticeable lack of interchange be- 
tween the two areas, yet a selective 
review of the literature suggests that 
behavior pathology in humans and 
animals may share some common 
principles. 

In the clinical area there appear to
be three general characteristics that 
apply to the functional behavior dis- 
orders. The first of these is the pres- 
ence of an intense anxiety reaction 
that disrupts goal directed behavior 
and mobilizes defensive processes 
aimed at warding off anxiety. 
Secondly, behavior relevant to the 
anxiety provoking situation becomes 
stereotyped and repetitive, furnish- 
ing the individual with a set of symp- 
toms that are remarkably intractable 
to change. Finally, needs and emo- 
tions operative at the point of severe 
frustration seem to be fixated, and 
consequently the steady progression 
to mature forms of emotional expres- 
sion and impulse control is disrupted. 
The individual is preoccupied with 
residual interests that are appropriate 
to an earlier phase of development, 
and current experiences are refracted 
to conform with these themes. 

The experimental literature reveals 
that the presence of acute anxiety and 
the formation of stereotyped, repetitive
symptoms are typical characteristics
of this area as well, and these
data provide a firm experimental 
foundation for the two similar char- 
acteristics observed in the clinical 
realm. The experimental studies
yield no evidence about the fixation of 
emotions and needs because they 
were not designed to obtain data of 
this sort, but some suggestive results 
in this direction have been obtained 
by other experiments on infantile 
feeding frustration. 

In combination, the clinical and 
experimental research raises the pos- 
sibility that the same principles con- 
trol behavior pathology in more than 
one species. The unique features of 
human pathology seem to be trace- 
able to the complex cognitive proc- 
esses through which the problem is 
expressed, rather than a fundamental 
difference in how the pathology
originates. We would tentatively
conclude that the indispensable fea- 
ture of pathology is a strong anxiety 
reaction keyed to significant aspects 
of the individual's experience; and if 
this be valid, suggest further that 
animal research on nonverbal tech- 
niques of therapy might yield results 
that could be translated and beneficially
applied at the human level.


METHODOLOGY OF GAIN STUDIES IN MAN-
MACHINE SYSTEMS¹


C. B. GIBBS 
Defence Research Medical Laboratories, Toronto, Canada 


There are considerable difficulties in communicating data concerning the
output/input amplitude relations of man-machine systems, and many 
arise from a lack of agreement on terminology and methods of measure- 
ment. The many different terms in use are compared for their clarity 
and convenience. The uniform use of the term "gain" is recommended; 
and the terms control gain, display gain, and system gain are defined and 
distinguished. Linear measures are generally used in studies of gain, but 
radial measures are superior for describing optimal limb movements for
controlling machines.


PROBLEM OF COMMUNICATION 


Beginning with Craik and Vince 
(1944, 1945) and Vince (1946) various 
investigators have studied the effects 
of changes in the magnitude of con- 
trol and display movements upon the 
operation of man-machine systems. 

Many of the findings and conclusions
of these studies are summarized in 
handbooks for the use of engineers. 
One such handbook (Woodson, 1954) 
recommends, presumably as a uni- 
versal optimum, a single control/dis- 
play (C/D) movement ratio for joy- 
stick controls. Another (Ely, Thomp- 
son & Orlansky, 1956) states that the 
available data cover few practical ap- 
plications and engineers are advised 
to run their own experiments to de- 
termine the “optimal C/D ratio" 
for a specific system. Some better 
solution is very desirable. It would be 
unfair to criticize reviewers for their 
discharge of an onerous and unen- 
viable task because in this area there 
has been a virtually complete break- 
down of communication. This is un- 
doubtedly the main cause of the 
existing confusion. 

Many different terms are commonly
used in discussing input/output
relations and little agreement appears
to exist as to their precise meanings.
The term gain will be used in
this paper to denote the display/con- 
trol (D/C) amplitude ratio. This is 
the reciprocal of the ratio which is 
most widely used at the present time. 
Difficulties in communication arise 
because most investigators have fol- 
lowed Jenkins and his co-workers 
(Jenkins & Karr, 1954, 1955; Jenkins, 
Mass, & Rigler, 1950; Jenkins & Ol- 
son, 1952) in using linear measure- 
ments when specifying the value of 
gain despite the fact that the move- 
ments of body members are basically 
radial about an anatomical pivot. 

The above problems can be solved 
only with some degree of agreement 
among different investigators con- 
cerning terminology and units of 
measurement. The purpose of the 
present paper is to clarify some of the 
issues. 


SEMANTIC PROBLEM 


The use of at least 10 different 
terms to designate changes of signal 
amplitude illustrates the need for a 
closer approach to uniformity when 
describing experimental method. A 
term is needed which can denote the 
area of study, specify a definite arithmetic
relation between input and output,
and also the direction of amplitude
change; i.e., whether the input
or output signal is treated as the ref- 
erence value. It is important to con- 
form to the normal convention of di- 
rectional change which has been es- 
tablished in other branches of sci- 
ence; e.g., the terms optical magnifi- 
cation and reduction are always used 
in physics in relation to the source, 
i.e., the input signal. 

Many of the terms used in manual 
tracking do not specify the direction 
of amplitude change; for example, the 
terms “proportional control factor" 
(British Standards Institute Commit- 
tee, 1949), "visual display scale," 
“arm control scale" (Fitts, Marlowe, 
& Noble, 1953), "gear ratio," "stiff- 
ness," and “sensitivity.” 

The term ratio meets an important
requirement in specifying the direc- 
tion of amplitude change. The C/D 
ratio or its reciprocal, the D/C ratio, 
may be used. Depending on the ex- 
perimental conditions, either may 
have advantages for the purposes of 
computation and communication. 
Integral numbers are more con- 
venient than fractional, and it is ad- 
vantageous to use the ratio relation 
for which changes of magnitude are 
compatible with those of the meas- 
ured variable; i.e., an increase in the 
variable should lead to an increase in 
the value of the ratio. The C/D ratio 
usually gives the stated advantages 
when control or input parameters are 
studied, but the D/C ratio is usually 
better in these respects when display 
conditions are varied and those of 
control are constant. The former 
type of experiment is carried out more 
frequently than the latter, and the 
widespread use of the C/D ratio may 
have arisen because it gives some ad- 
vantages in computation and com- 
munication in the majority of cases. 
The term has grave disadvantages
however when it is used in a more 
general sense to connote studies and 
data on the relative magnitude of dis- 
play and control movements. One 
unit of output is taken as the refer- 
ence value in the C/D ratio and, in 
effect, this signal is traced backward 
through the system to specify the 
value of the initial input signal corre- 
sponding to this end result. There are 
well-known difficulties of thinking of 
temporal processes in reverse order 
and it is not surprising that at least 
four writers, who have made valuable 
contributions to the literature, are 
somewhat inconsistent in their use of 
ratios and their reciprocals and of 
nominal and mathematical expres- 
sions (e.g., Bennett, 1956; Jenkins & 
Karr, 1955; Rockway, 1954; Wood- 
son, 1954). 

The term gain is less confusing than 
C/D ratio because it specifies the 
forward or natural direction of am- 
plitude change and it conforms with 
the normal practice of treating the 
input signal as the reference value. 
The special definition of gain to de- 
note the output/input—i.e., D/C 
ratio—follows the convention estab- 
lished in related branches of engi- 
neering and this agreement on arith- 
metic relation is essential in the pres- 
ent state of communication. 

Another basis for comparison of 
terms arises from the need for clear 
and concise distinctions between the 
conditions of different studies. In one 
type of study, control movements are 
varied and those of display are con- 
stant. The optimal extent of move- 
ment is expressed conventionally in 
terms of an optimal C/D ratio, which 
in a typical study of joystick control 


² In this situation the term "control gain"
is suggested, connoting the constancy of the
display conditions.
might have the value of 4 when radial
measure is used. In another type of 
study, which may follow the first, dis- 
play factors are varied while those of 
control are held constant.³ A typical
finding would be that 4X magnification,
i.e., a D/C ratio of 4, is optimal
for the particular tracking task,
and this is equivalent to an opti- 
mal C/D ratio of .25. A still more 
extended study might follow in which 
both control and display factors are 
varied to determine the optimal con- 
trol conditions for different degrees of 
magnification. There may be no 
interaction between the control and
display variables and in such circum- 
stances the study would produce yet 
another optimal C/D ratio of 1—i.e., 
the product of the two previous op- 
tima. This ratio would vary however, 
if there were interactions between the 
variables. Verbal terms must be used 
which permit distinctions to be made 
between data from the three kinds of 
study and should follow the conven- 
tional practice of attaching a distinc- 
tive prefix to a more general term, as 
is done for example in distinguishing 
"propeller speed" from "engine 
speed." The terms display gain, 
control gain, and system gain dis- 
tinguish between the three kinds of 
study and the three different optimal 
ratios which may be determined. 
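
The arithmetic relating the three optima described above can be set out in a short Python sketch. It is offered only as an illustration: the function names `gain` and `cd_ratio` are hypothetical, the values 4 and .25 are those quoted in the text, and the product rule holds only under the stated assumption that control and display variables do not interact.

```python
from fractions import Fraction


def gain(display_amplitude, control_amplitude):
    """Gain as defined in the text: output/input, i.e., the D/C amplitude ratio."""
    return Fraction(display_amplitude) / Fraction(control_amplitude)


def cd_ratio(g):
    """The conventional C/D ratio is the reciprocal of gain (D/C)."""
    return 1 / Fraction(g)


# Control study: the text quotes an optimal C/D ratio of 4 (radial measure).
optimal_cd_control = Fraction(4)

# Display study: 4X magnification is a D/C ratio of 4, i.e., a C/D ratio of 1/4.
optimal_dc_display = Fraction(4)
optimal_cd_display = cd_ratio(optimal_dc_display)

# System study, ASSUMING no interaction between control and display variables:
# the overall optimum is the product of the two separate optima.
optimal_cd_system = optimal_cd_control * optimal_cd_display

print(optimal_cd_display, optimal_cd_system)
```

The sketch makes the source of confusion visible: the same number 4 appears once as a C/D ratio and once as a D/C ratio, and only an explicit label on each quantity keeps the two apart.
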
The frequent use of the term opti- 
mal C/D ratio in works of reference, 
and the quotation of numerical values 
without stating experimental condi- 
tions, can convey a number of mis- 
leading impressions. The implica- 
tions are that a simple ratio can ex- 
press the complex interaction of 
different display and control condi- 
tions, that the numbers themselves 


³ Here, "display gain" would be used, connoting
the constancy of control conditions.

⁴ The term "system gain" would apply to
this more general study. 


have special significance, and that 
they should be maintained over a 
range of conditions. It may errone- 
ously be supposed, e.g., that a de- 
crease in the extent of control move- 
ment must be compensated by de- 
creased magnification in order to 
maintain the supposedly optimal 
ratio with similar compensatory ad- 
justments for a change in any factor. 
Studies of system gain would be 
needed to provide reliable data for 
making such compensatory changes 
but no valid information is available 
at the present time. The prefixes 
"control" and “display” in the term 
optimal C/D ratio carry the mislead- 
ing implication that the associated 
number states the optimal conditions 
for both control and display as would 
be done by a statement of “optimal 
system gain." In fact, the optimal 
C/D ratios which are usually quoted 
in works of reference relate only to 
studies of control. 

There can be no objection to the 
use of any ratio term which may be 
useful in computation, or for making 
precise and limited verbal distinc- 
tions. There is a further need, how- 
ever, for a term which denotes an area 
of study and can be used to make 
more specific distinctions by the addi- 
tion of a suitable prefix. Terms which 
contain the word ratio cause consid- 
erable confusion when used for the 
latter purpose, and the term gain is 
therefore recommended. 


PROBLEM OF MEASUREMENT 


The desirable attributes of a sys- 
tem of measurement are that it 
should have reasonably general ap- 
plicability, be easy to use, and re- 
quire the minimum number of differ- 
ent statements to specify any impor- 
tant general quantity such as optimal
control gain.

Linear measure, which is conventionally
used in manual tracking, does
not meet many of the stated require- 
ments. Some basic determinants of 
optimal gain, such as limb tremor and 
visual acuity thresholds, are fre- 
quently expressed in radial measure. 
At best, statements of gain in linear 
measure require laborious recalcula- 
tion before they can be related to 
basic data expressed in radial meas- 
ure. Conversion is frequently impos- 
sible because essential data—e.g., 
viewing distance—and the limb used 
for control are not stated in descrip- 
tions of experimental method. 
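
The dependence of such a conversion on viewing distance can be illustrated with a minimal Python sketch. The function name `visual_angle_deg` and the distances used are hypothetical, and the geometry assumes a display movement centred on the line of sight and perpendicular to it.

```python
import math


def visual_angle_deg(linear_extent, viewing_distance):
    """Angle (degrees) subtended at the eye by a display movement.

    Both arguments must be in the same linear unit. Assumes the movement is
    centred on, and perpendicular to, the line of sight.
    """
    return math.degrees(2 * math.atan(linear_extent / (2 * viewing_distance)))


# The same 1-unit display movement seen from two hypothetical viewing distances.
near = visual_angle_deg(1.0, 14.0)
far = visual_angle_deg(1.0, 28.0)

print(near, far)
```

A linear statement of display movement therefore corresponds to different radial values at different viewing distances, which is why the conversion cannot be made when the distance is not reported.
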
The linear and radial conventions 
have been compared for generality 
and conciseness in dealing with ex- 
perimental data when the thumb, 
hand, and forearm were used to con- 
trol positional and velocity systems 
(Gibbs, 1962). It was found that for 
optimal conditions all limbs moved 
through equal angles about their 
anatomical pivots. The linear dis- 
placements of the limb extremities 
differed greatly, depending on their 
distance from their respective ana- 
tomical pivots. One common state- 
ment of control movement in radial 
measure defined the optimum for all 
three members, but three different 
statements would have been needed 
if linear measures had been used. 
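
The relation between the two conventions is simply that of arc length to angle, s = r × θ. The following Python sketch illustrates it; the pivot-to-extremity distances and the common angle are hypothetical values chosen for illustration and are not data from Gibbs (1962).

```python
import math

# HYPOTHETICAL distances (cm) from anatomical pivot to limb extremity.
LIMB_RADIUS_CM = {"thumb": 6.0, "hand": 18.0, "forearm": 40.0}


def linear_displacement(radius_cm, angle_deg):
    """Arc length moved by the limb extremity: s = r * theta (theta in radians)."""
    return radius_cm * math.radians(angle_deg)


# One radial statement (hypothetical value) covers all three members ...
common_angle_deg = 10.0

# ... but it corresponds to three different linear statements.
displacements = {
    limb: linear_displacement(radius, common_angle_deg)
    for limb, radius in LIMB_RADIUS_CM.items()
}

for limb, s in displacements.items():
    print(f"{limb}: {s:.2f} cm")
```
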
This example illustrates the close 
interrelation of the problems of 
semantics and measurement. Data 
expressed in radial measure have 
wider applicability than those in 
linear measure, probably because 
muscles and their associated sensory 
tracts control movements which are 
basically radial. It may be generally 
true that physical measures of ob- 
servable activity will have their 
widest applicability when they corre- 
spond to the quantities which are di- 
rectly controlled by covert physio- 
logical mechanisms. The physical 
quantities of force, extent, and speed
have been measured in different 
studies of control gain. It is probable 
that the stimulation of kinesthetic 
receptors, which are sensitive to force 
as well as movement (Matthews, 
1933), is a main sensory determinant 
of optimal control gain and in some 
cases it is advantageous to
express optimal control conditions 
in terms of the forces exerted at 
the control (Gibbs, 1954). The re- 
sponse of kinesthetic receptors in 
muscles is correlated with the rate of 
muscular contraction when move- 
ments are made. Measures of the 
speed of movement may therefore be 
more widely applicable than measures 
of extent. It could be significant that 
the optimal speed of handwheel rota- 
tion for visual tracking (Helson, 
1949) agrees closely with the hand- 
wheel speed of an old-fashioned 
hurdy-gurdy where the output is 
auditory. In this case, the optimal 
input conditions in rpm apply to 
very different outputs. The semantic 
difficulties caused by a ratio relation 
do not arise in this case where speed 
is the measured variable, but they do 
occur in the more numerous cases 
where it is easy to measure control 
displacement but difficult to measure 
speed. 

The measurement of display gain 
presents difficult problems, some of 
which arise from the complex “ge- 
ometry" of visual space (Luneburg, 
1947). Shackel (1954) and Jenkins 
and Karr (1955) have studied opti- 
mal display gain when the viewing 
distance was varied, but their results 
point to opposite conclusions on the 
issue of whether radial or linear 
measures should be used in specifying 
optimal display gain. It may well be 
that the simple measurement systems 
which have been used to date cannot 
give a direct and widely applicable 
expression of the numerous and complex
factors which affect optimal display
gain. The rate of visual change
is probably an important factor but 
neither linear nor angular measures of 
the extent of display movement can 
express directly the effect of dynamic 
factors. It is probable that increased 
attention to the problems of measure- 
ment would produce results of wider 
applicability in studies of display, as 
well as control gain. 


THE EQUIVALENCE OF MEASURES AND THE
CORRECTION FOR ATTENUATION


JACK BLOCK 
University of California, Berkeley 


In literature reviews and critiques, measures are often evaluated in re- 
gard to their conceptual equivalence. In making this evaluation, the 
empirical correlation between the 2 measures being compared should be
corrected for attenuation at least approximately. Unless this correction 
is applied, alternative measures may be presumed to be importantly 
different when, giving due weight to the unreliabilities present, it may be 
seen that the identical underlying dimension is being reflected. In 
psychology, where enough discrepancies already characterize our findings,
this interpretive error should not deny us the occasional equivalences
that come along.


In literature reviews and critiques, 
a particular error frequently arises in 
the way in which research findings are 
construed. Because many readers 
lean on these integrating articles as a 
source of their knowledge and atti- 
tudes, the ramifications of a faulty 
evaluative logic in these reviews can 
be widespread. Accordingly, the 
error involved requires correction or 
at least recognition so that perspec- 
tives may be shaped more appropri- 
ately. 

The point to be developed in the 
present note is a simple one, learned 
in most elementary courses in psy- 
chological statistics, but neglected by 
many psychologists thereafter. Spe- 
cifically, when two measures are be- 
ing evaluated in regard to their con- 
ceptual equivalence, the empirically- 
obtained correlation between them 
quite properly should be corrected for 
attenuation. Unless this correction is
applied, or the effect of the unrelia- 
bilities present is recognized, inter- 
pretation of the raw correlation be- 
tween two measures of the same con- 
struct can be misleading in funda- 
mental ways. In particular, alterna- 
tive operational measures of a con- 
cept may be presumed to be importantly
different when, giving due
weight to the unreliabilities present, 
these measures in fact may be seen to 
reflect the same underlying dimen- 
sion. In psychology, where enough 
disparities, inconsistencies, and puz- 
zles already characterize our findings, 
this kind of interpretive error should 
not deny us the occasional congruen- 
cies that come along. 
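
For reference, the standard correction for attenuation may be written as below. The notation is assumed here and does not appear in the note itself: r_xy is the observed correlation between the two measures, and r_xx and r_yy are their respective reliabilities.

```latex
\hat{r}_{xy} = \frac{r_{xy}}{\sqrt{r_{xx}\, r_{yy}}}
```

Since neither reliability can exceed unity, the corrected coefficient is never smaller in absolute value than the observed one, and the two coincide only when both measures are perfectly reliable.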

Two articles in a recent issue of this 
journal manifest this failure to rec- 
ognize the effect of measure unreli- 
ability upon the consequent interpre- 
tation of research results. Witten- 
born (1961), in a review of the status 
of Q methodology, reports on a study 
by Block (1957) where ipsative Q 
items were correlated with essentially 
matching normative-type ratings. 
Wittenborn cites the correlations be- 
tween the various matched items as 
ranging from .64 to .88 in one sample 
and from .31 to .74 in another study. 
He concludes, “Apparently, the error 
involved in using ipsative item scores 
in a normative manner may vary 
greatly from item to item and from 
sample to sample’ (Wittenborn, 
1960, p. 135). 

The tenor of this conclusion sug- 
gests a lack of dependability in the 
equivalence of the two kinds of meas-
ures. A reader unfamiliar with the
original study could again, and with a 
sigh, bewail the excessive effect of a 
method per se on the characteristics 
of the scores derived. Closer inspec- 
tion of the original study, however, 
suggests a radically different conclu- 
sion is warranted. Although unre- 
ported by Wittenborn, the original 
article presented the reliabilities of 
the ipsative and of the normative 
ratings and the conceptual equiva- 
lence of the scores separately derived 
had been evaluated by correcting the 
raw correlation of matched variables 
for attenuation. Now the range of 
correlations is very different—from 
.83 to unity in the one study and from 
.63 to unity in the other (the lower 
level of corrected correlations in the 
second study was ascribed, in the 
original article, to poorer matching 
of the paired variables). The conclu- 
sion that follows from this alternative 
(and, we suggest, more correct) 
analysis is appreciably more positive 
in its tone and implications than 
Wittenborn's reporting indicates. 

As another instance of failure to 
recognize the effects of unreliability, 
we may cite the critique by Crowne 
and Stephens (1961) of the method- 
ology employed in studies of self- 
acceptance and self-evaluative be- 
havior. In their review, Crowne and 
Stephens (1961) repeatedly call atten- 
tion to the necessity of empirically 
testing the equivalence of similarly 
labeled measures. “Tests of self-ac-
ceptance . . . in the development of
which different procedures and items
have been employed are not equiva-
lent in the absence of empirical demon-
stration of their relationships” (p. 107).

Theirs is a salutary emphasis, one 
warranting widespread implementa- 
tion by psychologists prior to the 
seeking of publication outlets for re-
search. Unfortunately, in the specific
way Crowne and Stephens objectify 
their criticism of untested presump- 
tions of the equivalence of similarly 
labeled measures, they do not recog- 
nize the attenuating effects of un- 
reliability and thus are enabled to 
paint too black a picture of the state 
of self-acceptance research. 

Three studies are cited in the 
Crowne and Stephens (1961) review 
as testing directly the equivalence of 
self-acceptance measures. Reported 
are (a) the finding by Bills (1958) of 
correlations of .24 and .56 between 
self-acceptance scores derived from 
his Index of Adjustment and Values 
(IAV) and a self-acceptance score 
calculated from a questionnaire de-
veloped by Phillips (1951); (b) the
finding by Omwake (1954) of correla- 
tions of .55 and .49 between IAV-de- 
fined self-acceptance on the one hand 
and self-acceptance scores derived, 
respectively, from the Phillips ques-
tionnaire and from a scale constructed
by Berger (1952)—the correlation of 
.73 between the Phillips and Berger 
self-acceptance scores reported by 
Omwake is not mentioned by Crowne 
and Stephens; (c) the finding by 
Cowen (1956), in two samples, of 
correlations of —.07 and .06 between 
an IAV-derived score reflecting self-
ideal-self congruence and “stability of
self-concept” as measured by Brown- 
fain (1952). Cowen’s concomitant 
findings are not mentioned by 
Crowne and Stephens, that IAV 
self-ideal-self congruence correlates 
.43 and .62 in one sample and .44 and 
.29 in another sample with scores re-
flecting positiveness of self-regard.
Crowne and Stephens are distressed 
by the generally low correlations they 
report and suggest that here is psy-
chometric evidence that, conceptu-
ally, the various notions of self-accept-
ance held are importantly different.

Although their interpretation con-
ceivably, and in certain respects even
quite likely, is a correct one, an al-
ternative evaluation of these relation-
ships can be maintained. One obvious
remark, which must be made al- 
though its full development would 
require too much of a digression for 
the purposes of this note, is that a 
priori there was good conceptual rea- 
son not to expect high correlations 
among the measures being correlated. 
These measures have not been pre- 
sumed to be theoretically equivalent 
by their originators or by the re- 
searchers who have studied them. 
Rather, they have been recognized as 
likely to be related and studied ac- 
cordingly. One does not expect 
creativity to correlate perfectly with 
intelligence but an association of the 
two variables certainly is to be ex- 
pected. This argument applies—to 
an extent determined by one’s evalu- 
ating inclinations—to all of the stud- 
ies cited by Crowne and Stephens 
(1961) but applies with special force 
to the Cowen correlations they men- 
tion. There was no conceptual justi- 
fication for anticipating an equiva- 
lence between the Bills measure of 
self-ideal-self congruence and the 
Brownfain measure of the difference 
between one’s optimistic self-picture 
and one’s pessimistic self-picture.
(The Brownfain-derived difference 
could be small because both self-pic- 
tures are close to an ideal or the 
difference could be small because 
both self-pictures portray an indi- 
vidual distant from the person he as- 
pires to be.) 

Our central retort to the finding of 
only moderate correlation between 
these at least broadly-related meas- 
ures takes a different tack. Before 
interpreting these obtained correla-
tions, it is necessary to inquire: What
are the reliabilities of the measures 
employed and what would the corre-
lations among them be when cor-
rected for attenuation? These ques-
tions often are not directly answer-
able, for researchers, on the whole,
have manifested an unconcern with
reliability estimation—very few
measures come forward into the liter- 
ature with an indication of their func- 
tional stability. Reliabilities for the 
self-acceptance measures employed 
are not reported in the studies cited 
by Crowne and Stephens (1961) but, 
reasoning from similar measures, sen- 
sible guesses can be made as to the 
range within which these reliabilities 
would fall. For the sake of specificity, 
let us settle at an estimated relia-
bility of .7 for each of the self-accept-
ance measures. This figure is perhaps
an optimistic one, considering the 
reliabilities, for example, of MMPI 
scales (Dahlstrom & Welsh, 1960). 
But, since the argument to be illus- 
trated is weakened by higher rather 
than lower reliabilities, this estimate 
certainly is not unfair. 

Now, when reliability estimates of 
.7 are plugged into the formula for 
the correction for attenuation, a cor-
relation of .24 becomes .34; correla-
tions of .56, .55, and .49 become .80,
.79, and .70, respectively. The cor-
relations not mentioned by Crowne
and Stephens (1961), when corrected,
become 1.00 (Omwake, 1954) and .61,
.89, .64, and .41 (Cowen, 1956).
These figures are high ones. (Obvi-
ously, the zero-order correlations of
Cowen can show no such drastic
changes.) When the large and in-
escapable contribution of “method
variance” (Campbell & Fiske, 1959)
is recognized as still present and 
operating to lower correlations, these 
corrected correlations of self-accept- 
ance measures may be viewed as even 
more striking evidence of an essential
equivalence in the way the notion of 
self-acceptance has been represented 
by the several measures. 
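
The arithmetic behind these corrected values can be sketched briefly. What follows is an illustration added for the reader, not a procedure taken from the studies cited: it assumes the standard Spearman correction for attenuation, in which the raw correlation is divided by the square root of the product of the two reliabilities, and it adopts the reliability of .7 posited above for every measure. The function name `correct_for_attenuation` is hypothetical, and capping the result at unity is a convention assumed here.

```python
import math


def correct_for_attenuation(r_xy, r_xx, r_yy):
    """Estimate the correlation between two measures had both been
    perfectly reliable (Spearman's correction for attenuation).

    r_xy is the raw correlation; r_xx and r_yy are the reliabilities.
    The result is capped to the interval [-1, 1].
    """
    if not (0 < r_xx <= 1 and 0 < r_yy <= 1):
        raise ValueError("reliabilities must lie in (0, 1]")
    corrected = r_xy / math.sqrt(r_xx * r_yy)
    return max(-1.0, min(1.0, corrected))


# Assumed reliability: the estimate adopted in the text, not a reported value.
ASSUMED_RELIABILITY = 0.7

# Raw correlations discussed above (Bills; Omwake).
raw = [0.24, 0.56, 0.55, 0.49, 0.73]
corrected = [
    round(correct_for_attenuation(r, ASSUMED_RELIABILITY, ASSUMED_RELIABILITY), 2)
    for r in raw
]
print(corrected)  # [0.34, 0.8, 0.79, 0.7, 1.0]
```

With equal reliabilities of .7 the correction amounts to dividing each raw correlation by .7, which is why a raw value of .73 reaches the ceiling of 1.00.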

Stronger support for the equiva- 
lence of self-acceptance measures 
comes from the most comprehensive 
study thus far of the correspondence 
among such measures (Crowne, Steph- 
ens, & Kelly, 1961). This research 
was completed too late for inclusion 
in the Crowne and Stephens (1961) 
review. In this study, the average 
correlation among a variety of meas-
ures of self-acceptance was, in three
different samples, .53, .52, and .51.
Reliabilities are not reported but,
again estimating the average relia-
bility of these measures to be .7, the
average correlation among these dif-
ferent measures—corrected for at-
tenuation—proves to be .76, .74, and
.73. Again, taking cognizance of the 
further attenuating effect of method 
variance, it would appear that 
Crowne, Stephens, and Kelly have 
found their subjects order themselves 
almost equivalently on a self-accept- 
ance dimension, regardless of the 
particular measure employed. This 
writer surmises that the congruence 
comes about not so much from an 
underlying formal equivalence in the 
way self-acceptance has been concep- 
tualized, but rather results from the 
lack of fundamental diversity in the 
ways researchers have chosen to 
operationalize the notion of self-ac- 
ceptance. 

It should be emphasized that the 
principle employed in correcting ob-
tained correlations, thus boosting
them in the present instances to ex-
treme values, represents no sleight-of-
hand. By the straightforward appli-
cation of established procedures for
improving the reliability of measures, 
we may fully expect these theoretical 
correlations to become actuality. In
the immediate situation, however,
when we must exist in the presence of 
unreliability, we should employ the 
correction for attenuation because it 
provides an estimate of the corre- 
spondence between two measures in 
the limiting case of perfect reliability. 
This corrected correlation answers the 
question we ask when we attempt to 
evaluate the conceptual equivalence of 
measures. Although there are prob-
lems in implementing this correc-
tion—when no reliability estimates
are available or when reliability co- 
efficients themselves fluctuate as a 
function of their method of compu- 
tation—these are difficulties of detail 
rather than of principle. The limits 
within which a reliability coefficient 
undoubtedly must reside can be 
readily calculated or sagely, soberly 
estimated by a knowledgeable re- 
searcher. Or alternatively, where no 
more is required, the simple recogni- 
tion of the attenuation principle is 
enough to bring a more accurate 
perspective to bear on the raw (and 
ambiguous) correlation between two 
nominally equivalent measures. 
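
Where no reliability estimate is available, the bounds just described can be made concrete by recomputing the correction across a plausible range of reliabilities. The sketch below is hypothetical: the range of .6 to .9 and the function names are assumptions chosen only for illustration, and the raw correlation of .53 is the first of the three sample averages cited above from Crowne, Stephens, and Kelly (1961).

```python
import math


def correct_for_attenuation(r_xy, r_xx, r_yy):
    """Spearman's correction for attenuation, capped to [-1, 1]."""
    corrected = r_xy / math.sqrt(r_xx * r_yy)
    return max(-1.0, min(1.0, corrected))


def corrected_range(r_xy, rel_low, rel_high):
    """Return (lower, upper) corrected correlations when both measures are
    assumed to share a reliability somewhere between rel_low and rel_high.

    Assumes r_xy is non-negative: higher reliability means a smaller
    correction, so rel_high yields the lower bound.
    """
    lower = correct_for_attenuation(r_xy, rel_high, rel_high)
    upper = correct_for_attenuation(r_xy, rel_low, rel_low)
    return round(lower, 2), round(upper, 2)


# Hypothetical reliability range; .53 is a raw average cited in the text.
low, high = corrected_range(0.53, 0.6, 0.9)
print(low, high)  # 0.59 0.88
```

Even at the optimistic end of this assumed range the corrected value exceeds the raw correlation, which is the sense in which simple recognition of the attenuation principle already changes how the raw figure is read.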
BEHAVIORAL EFFECTS OF IONIZING RADIATIONS:
1955-61¹


ERNEST FURCHTGOTT 
A review of the rapidly increasing literature. The conclusions included:
(a) Mammals irradiated pre- or neonatally show relatively permanent 
deficits in several behavioral domains, such as learning, motor functions, 
mating, etc. (b) Some investigators report that even small doses affect 
neural functions and consequently CR acquisition while others find no 
effects even with large doses. (c) Radiation, however, may be a UCS in
avoidance conditioning. Also, several motivational variables are af- 
fected. (d) Except for vision where the results are equivocal the data on 
sensory functions are scant. (e) There is little evidence for long-term 
human changes. (f) It is emphasized that radiation may be a potentially
useful tool in several areas. (195 ref.)


In the first review of the behav- 
ioral effects of ionizing radiations 
published 5 years ago (Furchtgott, 
1956), there were 16 citations to 
journal articles dealing with what are 
usually considered primarily behav- 
ioral problems. Six of these articles 
appeared prior to the present modern 
era of nuclear science, i.e., prior to 
1946. Thus during the decade of 
1946 to the middle of 1955 there were 
only 10 publications pertaining to 
behavioral effects. In contrast, the 
present review for the 6-year period 
of mid-1955 to mid-1961 cites over 
100 publications in this area. This 
rapid growth in research, therefore, 
warrants a new examination of the 
current status of the work in this 
field. 

Aside from the inherent interest for 
theoretical or practical reasons in the 
effects of radiations per se, there has 
been an increasing use of radiations as 
a tool in the study of certain biologi- 
cal problems, and there are a number 
of additional domains, including be- 
havior, in which this tool could be 
profitably applied. 

¹ Preparation of this review was supported
by Research Grant M 1064 from the Na- 
tional Institute of Mental Health, United 
States Public Health Service. 


The writer assumes that the reader 
is familiar with some of the basic con- 
cepts in radiation biology. The previ- 
ous review (Furchtgott, 1956) pre- 
sented the terminology and concepts 
which are necessary for an under-
standing of the work on the behav-
ioral effects of ionizing radiations. For
a more comprehensive treatment of 
radiation biology the reader should 
consult the 2-volume treatise edited 
by Hollaender (1954) or some stand- 
ard text in the field such as Spear 
(1953), Lea (1955), or Fritz-Niggli 
(1959). 


DEVELOPING ORGANISMS 


It has been known almost since the 
discovery of X rays that proliferating 
tissues are especially radiosensitive 
(law of Bergonié & Tribondeau). 
Ionizing radiations, therefore, affect 
developing organisms to a much 
greater extent than they do adult 
forms. Of especial interest to the 
psychologist is the extreme sensi- 
tivity of the neuroblast, the stage be- 
tween the primitive neuroectoderm 
and the developed neuron (Hicks, 
1954). Rugh (1959) has recently ably 
reviewed the embryological aspects 
of vertebrate radiobiology. With a 
few exceptions, however, he omitted 
the behavioral literature and the
voluminous Russian contributions. 


Anatomical and Neurological Changes 


For a more meaningful presenta- 
tion of the behavioral findings it 
seems helpful to review the neuroana- 
tomical and neurophysiological data. 
The latter material is not treated ex- 
haustively, however, since that is not 
the primary aim of this paper. 

Prenatal Irradiation. Russell (1954) 
in reviewing the effects of radiation 
on mammalian prenatal development 
pointed out that anatomical changes 
are maximal if the treatment occurs 
during the period of major organo- 
genesis, which in the rat corresponds 
to Postconception Days 8-15 and 
in man to approximately Weeks
2-6. Furthermore, on the basis of
a number of studies which she re- 
viewed, she concluded that irradia- 
tion during the implantation period 
(Days 0-7 in the rat) results in 
a high percentage of deaths, but 
that virtually all survivors are nor- 
mals. Hicks (1954), Lengerová 
(1957), and Cowen and Geller (1960) 
also found that irradiating rat em- 
bryos prior to Postconception Day 
9 does not affect the central nervous 
system (CNS). 

Recently, however, Rugh and 
Grupp (1959a, 1959b, 1960) reported 
that irradiating mice even during the 
preimplantation period with as little 
as 50 roentgens (r.), administered in a 
single or two 25-r. doses, produces 
exencephaly in approximately 1% of 
the embryos. In the 1959 study no 
exencephaly was observed in 630 im- 
plantations in normal animals, while 
in a group of 625 embryos which re- 
ceived a single dose of 50 r. between 
Days .5 and 9.5 exencephaly was 
noted in 1.6%. In a 200-r. group 
(N = 681) the percentage was 7.0. In
the fractionated 50-r. group (two
25-r. doses), out of a total of 3,598
implantation sites, .5% resulted in
exencephaly. Admittedly these are
very small percentages, but the in- 
vestigators speculate that it is highly 
plausible to assume that the litter- 
mates of the exencephalic mice might 
also have carried cytological or histo- 
logical defects even though they did 
not exhibit the gross anomaly. If
these findings should be confirmed on 
other species and also by other in- 
vestigators, the concept of critical 
periods will have to be extended from 
a narrow specific stage to almost the 
entire embryonic and perhaps even 
the fetal stage. Nevertheless, as we
mentioned above, since Rugh's find- 
ings seem to be in sharp contrast to 
most previous studies, corroboration 
of these results would seem to be 
highly desirable. 

Hicks and co-workers (Hicks, 
Brown, & D'Amato, 1957; Hicks, 
D'Amato, & Lowe, 1959) have been 
continuing their analyses of the time- 
table of radiation-induced malforma- 
tions. They have been irradiating 
rats and mice at known stages of 
gestation and examining the subse-
quent histological changes in the em-
bryos and fetuses. They have col-
lected data from the one-somite em-
bryo on, somite by somite stage, up
to Day 20 of the gestation period. 

Hicks (1958) has carefully pre-
sented the case for the role of radia-
tion as a tool in mammalian develop-
mental neurology. By irradiating
the embryo or fetus at a known age 
and thereby destroying selectively 
cells which are just at the neuroblast 
stage, he makes a specific timetable 
of the development of the CNS feasi-
ble. Of especial interest to the stu-
dent of psychology, concerned with
the cerebral cortex, is the potential
use of this technique for unraveling
the mechanics of cortical formation
in mammals, about which very little
has been known so far. Hicks has 
shown that the cortex does not ex- 
pand evenly, that it is not formed 
layer upon layer, and that all of the 
phylogenetically primitive parts do 
not develop prior to the neocortex. 
The dorsal pallium does not develop 
substantially until late in gestation. 
Another problem which Hicks has 
been able to study by means of radia- 
tion is the course of cell migrations 
from neuroectoderm to cortex. This 
study may shed some light on the con- 
nectionistic-versus-nerve outgrowth 
theory controversy (Morgan & Stel- 
lar, 1950, pp. 326-327). Do the in- 
coming fibers imprint a functional 
specificity on the young neural cells? 
Or is there a pattern of predetermined
cell types in the neuroectoderm so 
that the right kind of cell will meet 
its matching fibers? 

A number of other investigators 
have also studied the CNS changes 
resulting from irradiation during the 
developmental period. Murakami 
and Kameyama (1958) irradiated 
mice with doses ranging from 25 to 
150 r. on Day 8 of gestation and ex- 
amined the embryos on Day 13. The 
CNS damage was already apparent 
in 15% of the recovered embryos in 
the 25-r. group while the percentages 
for the 50-, 100-, and 150-r. groups 
were 25, 77, and 94, respectively. 
The abnormalities induced by the low 
doses included hydrocephalus, brain 
hernias, and flexed spinal cords. This 
experiment thus presents specific 
changes based on a relatively large 
sample (there were 138 embryos in 
the 25-r. group) after a relatively low 
dose of only 25 r. 

The lasting CNS effects in prena- 
tally X-irradiated rats were studied 
by Cowen and Geller (1960). The 
subjects (Ss) were exposed in utero to 
250 r. at various stages of gestation.
The animals were sacrificed for ma-
between the ages of 3-19 months. A 
variety of gross and microscopic mal- 
formations of the brain were ob- 
served. The changes appeared to be 
the end result of disturbances in 
organogenesis marked by a lack, de-
creased number, or spatial displace-
ment of nerve cells and processes.
There were no cytopathologic 
changes in neurons or abnormalities 
of the glia or blood vessels. Forebrain 
defects were most severe in animals 
exposed on Days 15 and 16. Cerebel- 
lar malformations were associated 
with later exposure dates. Riggs, Mc- 
Grath, and Schwartz (1956) irradi- 
ated rats 5-9 days before birth (ap- 
proximately 12-16 gestation days) 
with 150 r. On Postnatal Day 60 the 
Ss were sacrificed, and the brains 
were examined. Maximal changes 
were observed in the neopallium, but 
differential changes in the various 
structures were not associated with 
the age at irradiation. The authors 
thus question the precise specificity 
of the critical periods concept, at 
least for the period which they stud- 
ied. 

Piontkovskiy and Kolomeitseva 
(1959) administered 200 r. to rats on 
Day 18 of gestation and then ex- 
amined the brains at the age of 45 
days. The brain weights of the ex- 
perimental animals were less than 
those of controls, and the damage 
noticeable even with the naked eye 
was most pronounced in the cerebral 
hemispheres. The upper cortical 
layers were very thin, especially in 
the dorsomedial parts, and Ammon's 
horn and the corpus callosum were 
poorly developed. Alexandrovskaya 
(1959) irradiated rats (150-200 r.) on 
Day 12 of gestation and then ex- 
amined the brains at the age of 16 
months. Again, atrophy of the cere-
bral hemispheres was pronounced.
Lengerová (1957) irradiated rat em- 
bryos directly, using 200-400 r., while 
shielding the mothers between Days 
1 and 13 of the gestation period. In 
accord with Hicks’ findings (1953) no 
CNS damage was observed in animals 
exposed prior to Day 8. Semagin 
(1959) administered 10 r. on each day 
of the gestation period (total dose 
was 200 r.) to seven rats. At 5 months 
of age the ratio of brain weight to 
body weight was significantly smaller 
(p<.01) in the irradiated than in the 
control rats. 

Ivanitskiy (1959a) compared
EEGs of rabbits which had received 
300 r. on Day 23 of gestation with 
control Ss. The irradiated group 
showed the following characteristics: 
lowered average voltage, increased 
percentage of low frequency waves 
and spindles, smaller frequency and 
voltage changes following light stimu- 
lation, longer aftereffects, and more 
frequent paradoxical reactions. 

Sikov and Noonan (1959) injected 
pregnant female rats with various 
doses of radioactive phosphorus (P³²)
on Days 6, 8, or 10 of gestation. 
Brain defects occurred in the 6- and 
8-day groups, even though most stud- 
ies of X-ray treatments administered 
on those days have failed to come up 
with such teratogeneses. The authors 
speculate that treatment prior to the 
critical days sensitizes the embryos so 
that even though on the latter days 
the radioactivity remaining from the 
earlier injections is less than the tera- 
togenic X-ray dose, it is still sufficient 
to produce central nervous system
injuries.

Postnatal Irradiation. Clemente,
Yamazaki, Bennet, and McFall
(1960) and Yamazaki, Bennet, Mc-
Fall, and Clemente (1960) studied
the neurological and morphological
changes in rats whose heads were ir-
radiated between 8 hours and 15 days
postnatally with doses ranging from
125 to 1,000 r. The neurological signs 
which were observed included tremor, 
leg dragging, head deviations, unco- 
ordinated gait, lunging movements, 
elevated pelvic posture, and chroni- 
cally abducted limbs. Observations 
were continued for 14 months. In 
general, the development of patho- 
logic signs was directly related to the 
dose and inversely to age. The ani- 
mals tended to develop a certain 
radioresistance after the first post- 
natal week. Examination of the 
brains of the animals showed a very 
high degree of correlation between the 
neurological signs and the patho- 
logical changes. The lesions were 
found most frequently in the sub- 
cortical white matter, basal ganglia, 
hypothalamus, cerebellum, and me- 
dulla. It is noteworthy that most of 
the clinical neurological signs were 
related to motor activities. Kosmar-
skaya and Barashnev (1958) also ir-
radiated rats between Postnatal Days
1 and 33 with doses of 250 to 500 r. 
They observed decreases in the size of 
the brain associated with irradiation 
during the first 5 days of life. Treat- 
ment on Day 14 did not result in a 
size decrement, but the shape of the 
cerebral hemispheres was different 
from that of a control group. In line 
with other investigations Kosmar- 
skaya and Barashnev emphasized the 
sensitivity of the cerebellum during 
the postnatal period. 


Summary 


Anatomical and neurological CNS 
changes primarily in the rat and 
mouse following irradiation during 
the developmental period have been 
studied by a large number of investi- 
gators. The treatments have been 
applied at almost all stages of de- 
velopment. Contrary to previously 
held views, one investigator observed
changes even in embryos exposed dur-
ing the periods of preimplantation
and major organogenesis. The effec- 
tive dose was reported to be as low as 
25 r. for certain periods of treatment. 

A variety of structural defects has 
been described and these have been 
related to some extent to the stage at 
which irradiation has been applied. 
However, the very narrow, in terms 
of time, application of the concept of 
critical period has been questioned by 
several investigators.

Hicks has discussed the feasibility 
of using irradiations as a tool in the 
study of neural embryology. 


Behavioral Studies 


It is not surprising that the ana- 
tomical findings of high CNS sensi- 
tivity should have led to behavioral 
studies. As a matter of fact, even in 
the absence of positive morphological
findings it would seem we have ample
precedence for such a strategy. There
are a number of agents which produce 
behavioral changes without any ac- 
companying histological changes de- 
tectable with presently available 
techniques. For example, electric 
currents producing convulsive shocks 
and, subsequently, memory deficits 
do not bring forth any noticeable 
histological changes (Siekert, Wil- 
liams, & Windle, 1950). 

Learning. Furchtgott, Echols, and 
Openshaw (1958) administered 100-
300 r. of X rays to albino rats on
Days 14 through 18 of the gestation 
period or neonatally and then tested 
them at the age of 45-50 days in a 
Lashley III maze. Learning deficits 
were inversely related to age at radia- 
tion. In the Days 14-15 group, a 
100-r. dose was already effective, 
while neonatally 300 r. was necessary 
to produce decrements. In a similar 
study Levinson and Zeigler (1959) ex-
tended the age at irradiation factor
up to 24 days post partum. They
used both a Lashley III maze and a
Hebb-Williams closed-field test. The
doses ranged from 150 to 350 r. Rats
irradiated during the first 4 days of life
showed the greatest deficits, while
those irradiated after Day 18 did not
differ from control Ss. Sharp (1961)
administered 280 r. between Days
10 and 17 directly to the rat fetus.
The irradiated Ss were inferior to
control Ss at the age of 115 days in
a 14-unit water T maze. Graham,
Marks, and Ershoff (1959) in testing
brightness discrimination found that
rats irradiated on Days 10 or 18 of
the gestation period with 150 r. did
not differ from controls, while 150 r.
on Day 14 or 300 r. on Day 18 de-
creased the rate of acquisition.

Furchtgott and Wechkin (1962) 
found that avoidance conditioning in 
a Mowrer-Miller box was more rapid 
in rats which received 200 r. on Day 
16 in utero than in control Ss. Since 
irradiated rats show greater activity 
and fearfulness than control animals 
(Furchtgott & Echols, 1958a) and 
rate of avoidance conditioning is 
positively related to both of the 
above variables, the results are not 
surprising. 

The data for the above studies 
using rats are consistent in showing 
learning changes which are related to 
the dose and age parameters. Maxi- 
mum deficits have been observed 
around Day 14 of gestation, coincid- 
ing with the period of maximal sensi- 
tivity of the cortex (Hicks, 1953). 

A number of Soviet investigators 
have been studying instrumental 
conditioned responses (CRs) in rats. 
Piontkovskiy and  Kolomeitseva 
(1959) tested 16 animals irradiated 
on Day 18 of the gestation period and 
16 controls at the age of 45 days. The 
irradiated Ss reached a stable level of 
responding (the criterion was not
stated) significantly later than did 
the controls (p<.01). The mean 
number of trials to stabilization was 
97.6 for an auditory and 36.0 for a 
visual stimulus in the irradiated 
group, while for the controls the cor- 
responding values were 23.0 and 21.5, 
respectively. Extinction of the CRs 
and the acquisition of discriminatory 
auditory CRs was also slower in the 
experimental group. The authors 
further noted that irradiated Ss fre- 
quently did not pick up or eat the 
food following the instrumental re- 
sponses. We may, therefore, wonder 
whether the authors controlled possi- 
ble drive differences. Similar results 
were also reported by  Semagin 
(1959), who compared seven rats 
which had received 10 r. daily during 
the gestation period (total dose was 
200 r.) with seven controls. The test 
trials began at the age of 42 days. 
Michailova (1960) found changes in 
conditioning in rats already after a 
50-r. dose administered on Day 12 of 
gestation, but a 200-r. group showed 
a much more pronounced deficit. 

In a study by Chesnokova (1959) 
one group of rats received 50 r. of 
irradiation on the first day of life, a 
second group at the age of 18-20 
days, and a third group was kept as 
controls. Instrumental conditioning 
beginning at the age of 20-25 days 
indicated that the Day 1 group was 
significantly inferior to the controls, 
but the Days 18-20 group was similar 
to the controls. Those data corre- 
spond essentially to the findings of 
Levinson and Zeigler (1959). Piont- 
kovskiy and Kruglikov (1960) irradi- 
ated rabbits on Day 23 of gestation 
with 400 r. and tested for the estab- 
lishment of a conditioned shake re- 
sponse beginning on Day 3 after 
birth. The mean number of trials to 
establish the CR (the criterion was 
not stated) was 43.4 (SD=4.0) for
the irradiated group and 17.2 (SD=
2.7) for the controls (p<.02). The
irradiated Ss then required a mean of 
71.3 (SD=5.3) trials to achieve a 
discriminatory CR, while the con- 
trols required only 32.3 (SD=2.7) 
trials. 

Haefner (1960) exposed mice to 
235-350 r. on Days 6, 12, or 14 of 
gestation. He found decreased run- 
ning speeds when these Ss were tested 
as adults in a complex maze which 
was similar to a Lashley Type III 
maze. Errors were not measured, but 
the author assumed that running 
speed was an index of their rate of 
learning. 

In contrast to all the above studies 
Meier (1959) reported that chicks 
irradiated during their embryonic 
period do not show deficits in solving 
a modified Dashiell maze. The chicks 
received a near median lethal dose on 
either Days 4, 8, 12, 16, or 19 of the 
incubation period. Meier used only 
animals which showed no gross mor- 
phological postural or locomotor de- 
fects. Several possible explanations 
may be invoked to account for the 
discrepancy between the rat and 
chick data. Meier believes that the 
rat data reflect not irradiation effects 
on the fetus per se, but rather 
"anoxia-like" conditions imposed 
upon the fetus through irradiation 
effects on the mother. It would seem, 
however, that this hypothesis does 
not account at all for the behavioral 
deficits observed in postnatally irra- 
diated Ss (Chesnokova, 1959; Furcht- 
gott et al., 1958; Levinson & Zeigler, 
1959) or the neurological and mor- 
phological changes in such Ss (Cle- 
mente et al., 1960; Kosmarskaya & 
Barashnev, 1958; Yamazaki et al., 
1960). Furthermore, Lengerova 
(1957) and Brent and McLaughlin 
(1960) in their studies irradiated the
embryos directly while shielding the
mothers, and they also obtained mor- 
phological changes which were similar 
to those observed by others when the 
mother was irradiated. Brent and 
McLaughlin found that even a 1,400- 
r. dose to the mother alone had little 
effect on the embryos. Sharp's (1961) 
recent study provides the most telling 
blow to Meier's hypothesis. In one 
group of rats the whole pregnant 
animal was irradiated; in a second 
group the lower half of the body, in- 
cluding the fetus, was shielded; and 
a third group consisted of control Ss. 
On both a multiple unit T maze and
a locomotor coordination
device (Furchtgott & Echols, 1958b)
offspring from the second group did 
not differ from controls while the 
whole-body group offspring were 
significantly inferior. From this 
study one can safely conclude that 
indirect effects from the maternal ir- 
radiation do not contribute signifi- 
cantly to the observed behavioral 
deficits in rats. 

Finally, Degenhardt and Grüter 
(1959) compared the CNS terato-
genetic effects of anoxia with those
produced by X rays. They found 
that the critical period and magni- 
tude of damage were different for 
anoxia than for X rays. 

There are more plausible interpre- 
tations of Meier's data. First, it may 
be hypothesized that it takes a more 
drastic CNS insult to affect a chick's 
than a rat's behavior. This, of 
course, is the application of the well 
known encephalization hypothesis 
(Brady & Bunnell, 1960, p. 361) to 
the present problem. It should also
be noted that Meier eliminated
those Ss which appeared to have 
received the greatest morphological 
insults. Secondly, the basic struc- 
tural development of the chick's 
brain is completed by Days 4 or 5 of
incubation and the essential features
are apparently completed by Day 12 
(Romanoff, 1960, p. 235). It is possi- 
ble, therefore, that, had Meier con- 
centrated his treatments more on 
the earlier stages of development, he 
might have found behavioral 
changes; i.e., perhaps he missed the 
periods of maximal sensitivity. 

Other Functions. Furchtgott and 
Echols (1958b) tested locomotor co- 
ordination in rats irradiated between 
Day 14 of gestation and neonatally. 
The S's task was to locomote with all 
four legs on two narrow parallel bars. 
The distance between the bars was 
increased step-wise up to the width 
which S could no longer negotiate. A
dose of 50 r. was already effective in 
depressing the scores of Ss irradiated 
on Days 14-15. In general there was 
an inverse relationship between age 
at time of irradiation and the minimal 
effective dose. Subsequently, Furcht- 
gott, Echols, and Dees (1960) re- 
ported that the deficits may be ob- 
served already after 25 r. Sharp 
(1961) reported similar data for the 
same task. He tested his Ss at the 
age of 40, 90, and 140 days. Differ- 
ences between the control and irradi- 
ated groups were apparent at the first 
two testing ages, but not at the age of 
140 days since at that age even the 
performance of the control Ss was 
poorer than at the earlier age levels, 
i.e., the task is not appropriate for
testing older animals. Wechkin,
Elder, and Furchtgott (1961) also ob-
served deficits in the ability of rats
irradiated on Days 16-18 of gesta- 
tion to climb up an inclined plane. 
The dose used was 200 r. Similarly, 
Werboff, Goodman, Havlena, and 
Sikov (1961) found several types of 
motor deficits in prenatally irradi- 
ated rats. 

Furchtgott and Echols (1958a)
noted changes in general activity
measured in tilting cages. The Ss
irradiated fetally (Days 14-18) were 
hyperactive (the dose range tested 
was 100-300 r.) while neonatal irradi- 
ation induced hypoactivity. Piont- 
kovskiy and Kolomeitseva (1959) 
also reported hyperactivity in ani- 
mals which had received 300 r. on 
Day 18. 

Related to the tilting cage activity 
measurements are data obtained by 
Furchtgott and Echols (1958a) in an 
open field. Here again they found 
that prenatal irradiation in general 
enhanced locomotion, while postnatal 
treatment inhibited it. The maxi- 
mum enhancement occurred in Days 
16-17 Ss. Open-field performance 
changes may be considered to be a
measure also of "emotionality," or,
more specifically, of fearfulness. In
the same study the authors also 
found that home-cage emergence,
which previous workers have used in
measuring fearfulness (Anderson,
1938; Willingham, 1956), is greatly
depressed in irradiated animals. 
Similarly, Furchtgott, Murphree, 
Pace, and Dees (1959) observed that 
irradiated male pigs were slower to 
emerge from their living quarters 
than control Ss of a comparable age. 
In the same study only 3 out of 10 
irradiated pigs mounted an estrous 
gilt, while 22 out of 28 control group 
pigs mounted and achieved or at- 
tempted intromission (p<.02). The 
greater fearfulness measured in terms 
of inhibition of mating behavior was 
also tested in rats. In two separate 
experiments fetally irradiated rats 
(the doses ranged from 50 to 150 r. 
administered between Days 10 and 
20 of gestation) exhibited less copu- 
latory behavior than comparable con- 
trol groups. Further support for the 
fearfulness hypothesis in accounting 
for mating decrements in male rats 
was presented in another study 
(Hupp, Pace, Furchtgott, & Murphree,
1960). The frequency of
occurrence of copulation plugs when 
females were caged with males for 20 
days was the same in irradiated (150 
r.) as in control Ss, even though in 
4-minute tests in a standard mating 
cage significantly fewer irradiated Ss 
copulated than did those in the con- 
trol group. Thus the behavioral 
decrements in the latter situation 
were not primarily associated with an 
overall lower mating drive, but rather 
with the fearfulness of the irradiated 
Ss, resulting in a tendency not to 
approach the novel stimulus which 
consisted of the estrous female in the 
mating cage. It was also noted in the 
home cages that the initial appear- 
ance of copulation plugs was signifi- 
cantly later in irradiated than in con- 
trol Ss. 

We may conclude, therefore, that 
prenatally irradiated mammals ex- 
hibit behavior patterns which may be 
best described as fearfulness. We 
must remember, however, that what 
has been termed fearfulness is not a 
unidimensional trait even in sub- 
human species (Willingham, 1956). 

Tacker and Furchtgott (in press) 
found that the adjustment to a 22- 
hour food deprivation cycle is slower 
in rats which had received 100-200 r.
of X irradiation between Days 14 and 
18 of gestation than it is in nonir- 
radiated control Ss. The Ss were 
tested at the age of 3 months, 1 
year, and 2 years. Since adjustment 
to food deprivation decreases with 
age, it might be hypothesized that 
the irradiated Ss exhibited more rapid 
aging than the control Ss. In an un- 
published study by Furchtgott and 
Elder similar data were obtained for 
a 22-hour water deprivation regimen. 


HUMAN STUDIES


A number of clinical case reports 
have again appeared during the past 
6 years in which neurological or be-
havioral irregularities were men-
tioned (Bašić & Weber, 1956; Cour-
ville & Edmondson, 1958; Lantvéjol,
Hervet, & Petit, 1957).

It is noteworthy that Miller (1956)
in reviewing the delayed effects oc- 
curring within the first decade after 
exposure of young individuals to the 
Hiroshima bomb noted that the only 
permanent defect of the children ex- 
posed in utero was microcephaly. 
The incidence and severity of micro- 
cephaly was a function of the distance 
of the victim from the hypocenter 
and gestational age. Mental retarda- 
tion occurred in almost half of the 
microcephalics. However, it should 
be pointed out that mental retarda- 
tion is an extreme form of deviancy. 
To what extent intellectual and other 
behavioral abilities of the victims not
diagnosed as microcephalic were also
depressed is, of course, unknown. 
Tsuiki and Ikegami (1956) also re- 
port six cases of mental deficiency in 
a group of 52 children who were ex- 
posed to the A-bomb. In this group 
of 52 a number of children also 
showed personality deviations on the 
Rorschach and on a Japanese auto- 
diagnostic questionnaire-type per- 
sonality test. Since comparable con- 
trol Ss were not tested, the personal- 
ity measures are difficult to evaluate. 
Izumi (1956) also found that 6 years 
after the Nagasaki blast exposed 
children were inferior to controls on 
an intelligence test and on tests of 
mathematical and language abilities. 
The following year, however, the 
group difference disappeared. No 
statistical evaluation of the results is 
presented nor does the author explain 
the basis for the rapid recovery. More
reliable studies of the exposed chil- 
dren would seem to be desirable. 

Wesley (1960) in an interesting 
paper boldly proposes that the in- 
crease in congenital malformations in 
the United States amounting to
about 6% over the past 30 years is 
due to a corresponding increase in 
background irradiation. It would 
seem, however, that Wesley is put-
ting all of his eggs into one basket,
since during this 30-year period 
other environmental factors, such as 
the increased use of anesthetics dur- 
ing delivery and greater exposure of 
the mothers to exhaust fumes and 
other debilitating agents, may also 
have contributed to the reported in- 
crease in the anomalies. 

In addition, Wesley also correlated 
the incidence of congenital malforma- 
tions in various parts of the world 
with calculated background radia-
tion for those localities. The r was
significant at .001. Kratchman and 
Grahn (1959) have also suggested 
that the mortality incidence from 
congenital malformations may be 
higher in those areas in the United 
States which contain the major 
uranium ore deposits, uraniferous 
waters, or helium concentrations. 

Discussion and Conclusions. It is 
quite apparent that mammals ir- 
radiated during their developmental 
period, pre- or postnatally, exhibit a 
variety of behavioral deviations. 
This is not surprising, since it has 
been well established that the de- 
veloping nervous system is highly 
radiosensitive. 

In the behavioral domains which
have been studied, the observed
changes were a function of both the 
dose and the age at which radiation 
was administered. In the rat the age 
of maximal sensitivity for the de-
velopment of changes in behavioral
domains varies, as it should if we 
assume that different neural and 
endocrine structures determine the 
different behaviors. Learning is max- 
imally affected following irradiation 
on Day 14 of gestation, or perhaps 
even earlier; open-field activity on
Day 16; and male mating activity on
Day 20 (Hupp et al., 1960). In
general the deficits were directly 
related to the quantity of radiation 
which was administered. 

The approach, which of necessity
always characterizes the initial stages 
of knowledge in an area, has been 
primarily taxonomic thus far. In- 
vestigators have been interested in 
determining whether certain kinds of 
behavioral phenomena are altered in 
organisms which are irradiated dur- 
ing their developmental period. Many 
behavioral domains such as sensory 
functions and perception, drives other 
than sex, and social behavior, to 
name a few, have not been studied 
as yet; and even in those domains, 
such as learning, in which data are 
available and functional relationships 
between the behavioral deficits and 
the important independent variables 
such as age at the time of irradiation, 
type and rate of delivery of radiation, 
and sex and species differences have 
been investigated, our information is 
scanty as yet. In this connection it 
would seem desirable to obtain addi-
tional behavioral data on human Ss
such as the Japanese bomb victims. 

No attempts have been made so 
far to use prenatal irradiation as a 
tool in the study of behavioral prob- 
lems. It seems appropriate to indi- 
cate here some of the potential uses 
of this tool. 

We have already mentioned
Hicks' (1958) advocacy of
irradiation as a tool in neuroembry- 
ology. A logical extension of this 
approach is the correlation of the 
morphological development with be- 
havioral development. How does the 
absence of certain structures affect 
the development of various behav- 
ioral functions? 

Hebb (1949, pp. 292-294), among 
others, has emphasized the differences 
between early and late brain injuries.
He hypothesizes that some types of
behavior require a larger amount of 
brain tissue for their first establish- 
ment than for their maintenance. 
Here we have, therefore, an excellent 
means for inducing the injuries at the 
earliest possible time, namely, before 
the structures are actually well 
formed. Since surgical ablations are 
not feasible during the embryonic 
stages and the passage of drugs 
through the placental barrier is not 
quantifiable, radiation would seem to 
be the method of choice for removing 
portions of the CNS in their forma- 
tive stages. 

Related to the previous problem is 
the analysis of the development of 
compensatory functions in the pre- or 
neonatally injured animals. Do the 
deficits which may be observed rela- 
tively early in the animal's lifespan 
tend to become fixed, or is there an 
amelioration or perhaps even an 
aggravation of the symptoms? 

Finally, since there have been 
several theories in which radiation is 
postulated as a "general aging"
agent, developmentally irradiated Ss 
may be used to study behavioral ag- 
ing phenomena, assuming of course 
that the radiation induced aging 
theory should be verified. 


NERVOUS SYSTEM IN THE ADULT 


The vertebrate CNS has always 
been considered to be less sensitive 
to ionizing radiations than are other 
bodily systems. Usually it takes 
doses at or above the whole-body 
LD50 to induce neural changes. A
1960 Committee report of the Na- 
tional Academy of Sciences states, 
“Doses in hundreds of roentgens 
seem to have little effect on adult 
nervous tissues. Recent reports that
subtle functions of the brain are 
disturbed by doses of a few roentgens 
still await confirmation" (p. 31).
During the past 6 years there have
been no studies published in the 
American or Western European liter- 
ature which would cast doubt on the 
previously mentioned hypothesis. 
This is especially true of morpho- 
logical investigations. Haymaker, 
Nauta, Sloper, Laquer, Pickering, 
and Vogel (1958) saw no evidence of 
meningitis in six monkeys exposed to 
1,000 r. In a group of 67 monkeys 
which received between 1,000 and
30,000 r., neuronal damage to cortical cells
was "uncommon, and could be at- 
tributed, as a rule, to ischemic 
changes related to intense vascu- 
litis" (p. 23). Some changes in the 
cerebellar granular cells were ob- 
served in Ss which had received 
2,500 r. or more. Perese, Murphy, 
and Parsons (1958) found no changes 
in the cerebellar Purkinje cells of mice 
which had received less than 2,670 r. 
Pourquier, Baker, Giaux, and Benir- 
schke (1958) reported that doses of 
4,500-18,000 r. were required for 
changes in the white matter of the
hypophyso-hypothalamic region of
guinea pigs. These doses did not
have any appreciable effects on 
cortical cells. 

Even somewhat lower doses, how- 
ever, apparently do produce latent 
changes. Bailey, Ingraham, and 
Bering (1958) inserted Tantalum 
182 (half-life of 117 days) into the 
parieto-occipital region of monkeys. 
The total dose delivered by the wires 
was 400-600 r. Microscopic studies 
conducted 3-33 months after comple-
tion of the gamma radiation revealed
definite necroses at the site of the 
irradiation. The authors believe 
that damage to the blood vessels was 
the major debilitating factor, but that 
a slight direct neuronal effect was also 
present. 

Physiological methods have re-
vealed changes at dose levels lower
than those seen by morphological 
analyses. Nevertheless, even with 
these techniques the doses at which 
radiation effects were observed were 
in most instances at or above the 
whole-body LD50 level.

Winkler (1957) exposed the heads 
of 45 rabbits to 400, 600, or 800 r. of 
X irradiation. Even the lowest dose 
produced a disturbance of perme- 
ability of the blood-brain barrier, 
which, however, returned to the nor-
mal level after 6 days. Caster,
Redgate, and Armstrong (1958) ob- 
served a decrease in cortical DNA 
after 700 r. of whole-body X irradia- 
tion (WBR) in rats. These changes 
paralleled decreases in electrocortico- 
gram potentials, persisting up to the 
time of death. Bailey et al. (1958) 
found marked voltage and frequency 
asymmetry in monkeys a few days 
after they had received 500 r. gamma 
irradiation resulting from the inser- 
tion of Tantalum 182 wires into the 
parieto-occipital region. On the ir- 
radiated side the voltage was de- 
creased while on the opposite side the 
predominant frequency was faster 
than before irradiation. The maxi-
mum changes occurred approximately
6-8 weeks after irradiation, and then
the patterns tended to revert to
normal. About 3 months after the
completion of the irradiation, the 
electroencephalograms (EEGs) ap- 
peared to be normal. As time passed 
some of the animals began to show 
again slow waves and decreased 
voltage on the irradiated side, and 
after about 2 years, 2 out of 12 
monkeys showed epileptic patterns 
with runs of spikes and very fast 
activity, localized in the region of the 
irradiation. In general the EEG pat- 
terns did not indicate any focal 
lesions, but rather general cerebral 
damage. Rübe (1959) observed no 
cortical EEG changes in guinea pigs
after 500 r. A 1,000-r. dose did pro-
duce temporary changes lasting less
than 2 days. 

Gangloff and Haley (1959) admin-
istered 400 r. WBR to five cats.
Immediately following the treatment 
spikes appeared in the hippocampus 
and amygdala. The peak of these 
potentials appeared 3-7 hours follow- 
ing the treatment. Three days later 
the records appeared to be normal. 
Low voltage fast waves could also be 
observed during this period. Reticu- 
lar arousal thresholds also decreased 
immediately following irradiation, 
but they returned to the normal level 
within about 4 hours. Thalamic re- 
cruiting thresholds increased 2-4 
days after irradiation and this change 
lasted apparently until the animals 
died 5-10 days following the treat- 
ment. In a subsequent report the 
same authors (Gangloff & Haley, 
1960) tested cats after 400-r. and 
200-r. WBR, 400 r. to the head only 
and 400 r. to the body only with the 
head shielded. There were sponta- 
neous spike discharges in the dorsal 
hippocampus of all whole-body Ss, 
including the 200-r. group, 30 minutes 
after exposure. 

Tanimura (1957) reported that 1-2 
days after X irradiation with 300 r. 
neurosecretory granules in the hypo- 
physial-hypothalamic system showed 
an increase compared to controls, and 
this was then followed by a decrease. 

Preliminary reports from German 
laboratories are contrary to the 
widely held view that the nervous 
system is relatively radioresistant. 
According to Hug (1958) X and 
alpha irradiation produces a reflex 
retraction of the feelers in various 
European land and water snails. The 
latency for the response was 5-15 sec-
onds and the threshold dose for
Helicella candidans was 1.5-2.0 r/sec,
for Arion empiricorum 1.5-2.5 r/sec,
and for Helix pomatia 15 r/sec.
The latency was inversely related 
to the dose. The author did not 
give the number of Ss which he 
tested or the variability within each 
group. He did not know whether 
radiation acted as an adequate or 
inadequate stimulus, i.e., what was 
the sensory pathway which mediated 
this response. In another study 
Kroebel and Krohm (1959) found 
that a dose of 4-6 r. of alpha rays 
raised the chronaxie of excised frog 
sciatic nerves. Only the average 
curves for 12 control and 9 treated 
preparations were presented with no 
indication of the variability. Another 
graph (no mention is made of the N 
of the samples) actually presented a 
heightened chronaxie during the ir- 
radiation when the total dose was 
less than one r. The authors at- 
tributed the changes to the more 
rapid deterioration of the irradiated 
nerves. Born (1960) also reported 
similar preliminary results for the 
contraction of the mantle cavity in 
pulmonates. 

In contrast to these reports some 
Soviet workers have claimed that 
actually the CNS is extremely radio- 
sensitive. They have published very 
extensively in this area. This, of 
course, is not surprising since the 
Pavlovian school of physiology 
stresses the role of the nervous system 
in most bodily processes. Many of 
the studies are based on functional 
analyses of the organisms' response to 
irradiation. Fortunately for the non- 
Russian speaking reader, Stahl (1959, 
1960) has reviewed and critically 
analyzed most of the Soviet litera-
ture.

Stahl (1959) points out that there 
is no unanimity among Soviet re- 
searchers on CNS radiosensitivity. 
While some claim that changes may 
be detected following the admin-
istration of just a few r., or even at
lower dose levels, others report data
which are in essential agreement with 
those reported by a majority of 
Western scientists. One of the 
difficulties in assessing much of the 
Soviet literature is that either the 
experimental papers frequently omit 
details which are necessary for a 
critical analysis of the results, or the 
experiments themselves lacked cer- 
tain controls which according to our 
standards are essential for reliability. 
The following shortcomings of many 
Soviet papers are based on Stahl's 
evaluation (1959) and on the perusal 
of the literature by the present 
writer: (a) lack of statistical analysis 
of the data, use of very small samples 
which may obscure variability, and 
presentation of only "typical" re- 
sults; (b) inadequate description of 
instrumentation, procedures, and 
dosimetry; (c) limited description of 
the general conditions of the animals 
after irradiation (changes in CRs are 
reported without mentioning whether 
the animal was in good health or 
suffering from radiation malaise 
which includes among its major 
symptoms anorexia); (d) failure to 
report negative findings. 

The Soviet researchers have used a 
variety of techniques in assessing the 
effects of ionizing radiations on the 
nervous system. With reference to 
histological studies some investiga- 
tors claim (Aleksandrovskaia, cited 
by Stahl, 1960) that pathological 
changes in the CNS may be observed 
following a dose of 50 r., while others 
report data in essential agreement 
with those obtained by Western re- 
searchers, namely, a lack of histo- 
logical changes below the whole-body 
LD50 level (see Stahl, 1960).

By far the largest effort, however, 
has been directed toward an analysis 
of functional changes embracing pri-
marily bioelectrical and conditioned
response processes. (The latter will
be discussed in the section on behav- 
ioral changes.) At the 1958 Geneva 
Conference, Soviet investigators em- 
phasized the fact that the frequently 
cited radioresistance of the nervous 
system has been based mainly on the 
analysis of gross histological studies 
rather than on subtle physiological 
changes (Livanov & Biryukov, 1958). 
In support of their contention that 
the CNS is highly radioresponsive, 
Lebedinsky, Grigoryev, and Demir-
choglyan (1958)² reviewed several
studies in which electroencephalo- 
graphic changes were observed in 
animals and man following doses of 
less than 5 r. and in some cases al- 
ready after less than one r. They 
cited Grigoryev's study in which in- 
creased bioelectrical currents were 
evident in the human cortex 30-120 
seconds after the start of irradiation 
of the head and abdomen following a 
total dose amounting to only 3-4 r. 
At times changes were apparent even 
in the first few seconds of irradiation 
when the dose amounted to only 
about one r. This temporary increase
in cortical bioelectric activity was 
usually followed by a depression. 
Even more remarkable findings of
cortical sensitivity were reported
from a study of rabbits subjected to 
gamma irradiation at a dose rate of 
.13-.03 r. per second. Initial reac- 
tions to irradiation were observed in 
2 out of 22 cases after an exposure of 
only .05 r. and the maximum dose for
the initial reaction was no more than
1.4 r.

² To simplify the task of locating references,
the transliteration of Russian names as used
by each translator was retained in this re-
view. Since different translators used different
systems, a certain amount of inconsistency
appears in the spelling of some names. For
references which are available only in Russian,
a consistent system of transliteration was
used.

Aside from measuring resting EEG 
potentials, reactivity was also studied 
in response to visual stimuli. Here 
again the Soviet workers found a 
temporary rise in excitability and 
lowering of the threshold for the 
EEG response which was then fol- 
lowed by a prolonged depression of 
excitability. 

Stahl (1960) in reviewing these and 
other studies points out some of their 
weaknesses, namely, lack of precise 
definition of the EEG changes (the 
records do not reveal a clear-cut spike 
or other measurable effects) and 
possible vibration and movement 
artifacts resulting from a startle re- 
sponse. Since the irradiation usually 
was not terminated after the changes 
following the small doses, the per- 
manence of the observed effects was 
not determined. With doses of 50 r. 
or higher, Lebedinsky et al. (1958) 
reported effects which were apparent 
for 2 weeks. 

Livanov and Biryukov (1958) hy- 
pothesized that with whole-body 
doses in the midlethal region or higher
(800 r. or more for mammals) there
occurs increased electrical activity 
and a concomitant lowering of the 
threshold for stimulation in the sub- 
cortical centers. They cited experi- 
mental data from their own labora- 
tory in support of this hypothesis. In 
the cortex, however, the increased 
bioelectrical activity following ir- 
radiation lasts for a relatively short 
time (less than 1 hour) and is then 
replaced by decreased activity and in- 
creased thresholds. Normal levels are 
not attained for at least 7 days or 
longer. 

Stahl (1960) cited several animal 
and human studies of patients re- 
ceiving radiation therapy of the brain 
or localized peripheral areas in which
the initial phase following the appli-
cation of the treatment, with the
doses varying in the different studies 
from 5 to 2,000 r., was characterized 
by increased EEG activity and de- 
creased EEG responsivity to pe- 
ripheral stimuli. The brief hyper- 
excitability, however, was soon re- 
placed by a much longer period of 
depressed activity. Yanson (1957) 
subjected 13 rabbits to 500-r. WBR. 
He did not report specifically any 
initial EEG hyperactivity, but he did 
find the characteristic depression 
which was apparent after 15-20 
minutes following the completion of 
irradiation. He reported similar find- 
ings in animals in which either the 
head only or the body only was ir- 
radiated. 

Smirnova (1959) subjected cats to 
12- to 600-r. WBR and she recorded 
resting electrical potentials in the 
hypothalamus and changes to audi- 
tory stimuli (50-cps tone). For the 
first 3 hours following irradiation 
there was an increase in hypotha- 
lamic excitability measured in terms 
of lowered thresholds for the induction 
of potential changes. There was also 
a general increase in sympathetic ex- 
citability, measured in terms of 
changes in skin potential and respira- 
tion. 

Soviet workers have also studied 
the effects of radiations on various 
unconditioned reflexes and motor 
chronaxies (Stahl, 1959). The re-
sults have been equivocal. While
some investigators observed signifi- 
cant increases in response time after 
a dose of only a few r., others 
placed the threshold dose at several 
hundred r., and in one study no 
changes in the motor chronaxies of 
the spinal cord were observed even 
after a dose of 600 r. A good illustra- 
tion of the problems of interpreting 
some of the Soviet research in this
area is provided by the study of
Poplavskiy (1956). Doses between 40 
and 320 r. produced decreases in 
reflex activity in mice in experiments 
performed in the spring, while the 
same doses produced a temporary 
increase in reflex activity lasting for 
24 hours in experiments conducted in 
the fall. Poplavskiy assumed that 
seasonal factors affected reflex re- 
sponses. 

Several studies have also dealt 
with biochemical changes in the 
brains of irradiated animals (Kosya- 
kov, 1959, 1960; Pomazanskaya, 
1960).


Discussion and Summary 


In general, histological studies tend 
to support the long-held view that 
adult neural tissues are relatively in- 
sensitive to ionizing irradiations. 
Some recent American studies actual- 
ly tend to indicate that they are even 
less sensitive than was heretofore 
believed (Zeaman, Curtis, Gebhard, 
& Haymaker, 1959). 

It would seem that research deal- 
ing with gross histological studies at 
low or moderate dose levels, those 
below whole-body LD50, is definitely
on the decrease. This is to be ex- 
pected, since a considerable amount 
of evidence has accumulated that 
gross morphological changes are not 
discernible at such levels of radiation. 
The newer histochemical techniques 
in which various CNS metabolites 
and constituents are assayed, such as 
brain catalase, cholinesterase, DNA, 
serotonin, etc., have not been em- 
ployed sufficiently extensively as yet 
to permit an evaluation of the more 
subtle changes. 

There has been increasing interest 
in measuring electrophysiological
changes in irradiated animals. In 
general, temporary alterations may 
be observed at dose levels which are
lower than those producing morpho- 
logical effects. Most Western scien- 
tists have reported changes following 
doses of several hundred r. No recent 
study has been concerned specifically 
with threshold dose determinations. 
It is also not certain whether the ob- 
tained results are attributable pri- 
marily to direct neuronal damage or 
are secondary manifestations associ- 
ated with diverse metabolic changes 
in the vascular system. 

In contrast to the relative paucity 
of systematic studies emanating from 
Western laboratories, Soviet workers 
have dealt extensively with the 
problem of radiation changes in the 
nervous system focusing especially 
on functional analyses, namely, on 
electrophysiological effects and un- 
conditioned and conditioned reflexes. 
The results of these studies are in- 
conclusive. While some investigators 
reported that functional changes may 
be detected after 1 r. or less, others 
reported findings very much in line 
with those of Western workers. It is 
difficult to evaluate many of these 
studies since the reports either are 
incomplete or the experimental de- 
signs do not permit unambiguous 
conclusions. 


LEARNING 
Soviet Research


Soviet researchers working within 
the framework of Pavlovian physi- 
ology have dealt extensively with the 
problem of conditioning in irradiated
organisms. Indeed, according to one 
of the major Soviet spokesmen in the
field (Lebedinsky, 1958):


The results of experimental work show that
the functional condition of all parts of the
central nervous system undergoes [...]
even when administered [...]
is due to a wide use of Pavlov's method of
conditioned reflexes (p. 8).


The well known physiologist Orbeli 
also believes that CNS changes may 
be detected by the CR method when 
no symptoms of the radiation syn- 
drome are present (cited by Stahl, 
1960). Soviet investigators do not 
seem to agree on the minimal effec- 
tive dose which produces detectable 
CR changes. Ivanitskiy (1959b), in 
summarizing the results of a Soviet 
conference on the effects of ionizing 
radiation on the higher functions of 
the nervous system, reported that 
some studies have shown functional 
changes which last for a definite 
period after doses of only .5-15 r. 
These changes are most pronounced 
in internal inhibitions (differentia- 
tion, delayed reaction, and condi- 
tioned inhibition). Chronic dis- 
turbances are evident in animals that 
do not exhibit other signs of radiation 
sickness. 

Samoylova (1959) tested instru- 
mental conditioning in eight ir- 
radiated and eight control rats with 
food as the reward. The experimental 
Ss received .1 r. every other day for 
a period of 4.5 months. She claimed 
that in rats of a certain "type of 
nervous system" a cumulative dose 
of only 1.2 r. was sufficient to alter 
responses, and these effects became 
more pronounced after 5 r.; the 
changes persisted for 1 year after the 
cessation of exposure (the total dose 
was only 7 r.). 
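
The dose figures quoted for Samoylova (1959) can be checked with simple arithmetic. The sketch below is only an illustration: the average month length of 30.4 days and the counting of exposures as complete 2-day intervals are assumptions, not details given in the report, and the function names are hypothetical.

```python
# Back-of-the-envelope check of the cumulative doses quoted for Samoylova (1959).
# Assumptions (not in the source): one month = 30.4 days, and the number of
# exposures equals the number of complete 2-day intervals in the period.

DAYS_PER_MONTH = 30.4  # assumed average month length


def exposures_in(duration_months, interval_days=2):
    """Number of exposures delivered during the given period."""
    return int(duration_months * DAYS_PER_MONTH // interval_days)


def exposures_to_reach(target_r, dose_per_exposure_r=0.1):
    """Number of exposures needed to accumulate the target dose (in r.)."""
    return round(target_r / dose_per_exposure_r)


n_total = exposures_in(4.5)            # exposures over 4.5 months
total_dose_r = n_total * 0.1           # cumulative dose in r.
print(n_total, round(total_dose_r, 1))
print(exposures_to_reach(1.2))         # exposures needed to reach 1.2 r.
```

Under these assumptions the schedule gives 68 exposures and about 6.8 r., which agrees with the reported total of "only 7 r."; the 1.2-r. point would be reached after 12 exposures, i.e., after roughly 24 days.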

Gorsheleva (1960) measured trace 
CRs to visual stimuli in eight rats 
which had received a single 5-r. ex- 
posure of X irradiation. She claimed 
that disturbances in the inhibitory 
phase, where Ss responded during the 
trace period, were noticeable in some 
cases 30 minutes after treatment and 
that they lasted for 21-25 days. De- 
tailed data are presented only for two 
typical Ss. Examination of these 
data, however, seems to indicate that
the within-S variability both before
and after irradiation was probably 
not much smaller than the before- 
after treatment difference. No con- 
trols were tested to observe changes 
merely as a function of the continua- 
tion of the testing procedure. Makar- 
chenko and Zlatin (1959) exposed 
four dogs to .05 r. per hour 6 days per 
week. In "strong" and "intermedi- 
ate" types positive CRs declined 1 
month after exposure. In “weak” 
types the decline was preceded by an 
enhancement. After 8.5-9.5 months 
of exposure all dogs showed increased 
CRs. No specific data were presented, 
only typical graphs. 

Yarullin (1959) tested conditioned 
avoidance and respiratory responses 
in three dogs and found changes in 
latency after a single exposure of 15 
r., but the changes were cyclical and 
undulating, so that even after a dose 
of 765 r. in a weak type and 1,335 r. 
in a strong type the responses re- 
sembled those of normal dogs. 
Kotlyarevskiy, Gorsheleva, and 
Khozak (cited by Stahl, 1960) re- 
ported that conditioning was en- 
hanced in rats which received three 
50-r. doses spaced 1 to several weeks 
apart. Doses above 150 r., on the 
other hand, produced performance 
decrements which lasted several 
months. Stahl (1960) analyzed the 
data statistically, finding that if 
we accept the usual criterion of 
significance (.05 level), the authors'
conclusions were not warranted. 
Malyukova (cited by Stahl, 1960) 
found that conditioned avoidance 
responses in a Miller-Mowrer type 
apparatus extinguished twice as 
rapidly in mice after a 180-r. dose 
as prior to irradiation. (From the
description by Stahl it would seem 
that the comparison was between 
original pre- and postirradiation ex- 
tinction which occurred after recon-
ditioning. It is questionable whether
such comparisons are justifiable.) 
There have been a large number of 
studies in which doses of 300 r. or 
more were used. Yanson (1957) 
reported the results of 636 experi- 
ments on five rabbits which received 
500 r. of whole-body irradiation and 
eight Ss with head only or body only 
treatment. Trace conditioned avoid- 
ance responses using a flashing light 
as the conditioned stimulus (CS) and 
electric shock to the hind leg as the 
unconditioned stimulus (UCS) were 
employed. One hour after WBR, CRs 
were absent even though the cortical 
electric potentials were still present 
or even stronger than prior to treat- 
ment. Restoration of the CRs took 
place gradually between Days 3-7 
following irradiation. However, the 
newly established CRs were erratic; 
at times they would appear, and at 
other times they were absent during 
the next 1 or 2 weeks. Irradiation of 
the head only produced changes 
which were similar to those observed 
following whole-body irradiation. 
Exposure of the body only produced 
depression or absence of the CRs for 
only 1-2 days following irradiation. 
Restoration of the CRs occurred on 
Days 5 and 6 after exposure. From 
concurrent studies of the bioelectric 
potentials, the author assumed that
the inhibition of the CRs was associated
with decreased activity in the
cortical visual and motor areas. The
earliest changes in the reflex are apparently
in the central (suprasegmental)
part of the chain since shielding
the head resulted in a considerable
delay in the changes. Only
graphs of typical changes were pre- 
sented. 

Piontkovskiy, Miklashevskiy, and 
Meyerson (1957) applied 600 r. of 
gamma irradiation to the whole-body, 
head-only, or stomach region only of
albino rats. The Ss' instrumental
CRs were then tested. Four sequences
of changes were observed:
period of initial changes, period of 
maximal disturbances, period of res- 
toration, and period of aftereffects. In 
the locally irradiated Ss the initial 
changes were minimal, merging with 
the period of maximal activity. In 
the whole-body irradiated group Period
1 was characterized by an increase
of positive CRs, decreased latency, 
and disinhibition of differentiations. 
During Period 2, which set in 24-48 
hours following treatment and lasted 
3-13 days, CRs were neither dimin- 
ished nor inhibited. During Period 3 
the responses tended to return to 
their normal level, but a great 
deal of instability (low or high fre- 
quency of responses, disinhibition of 
differentiation, etc.) was present. 
This period lasted from 35 to 45 days. 
In the final phase, which lasted up to 
10 months, the periods of undulation 
which were observed in Phase 3 were 
of even longer duration. No specific 
data were presented except graphs of 
typical Ss. 

Belokonski (1959) studied escape 
responses in rats using electric shock 
as the UCS and sound as the CS. 
With 350-700 r. both CRs and
unconditioned responses (UCRs)
showed heightened activity. With in- 
creases in the dose level some animals 
displayed inhibition of activity, while 
others an initial excitation alternat- 
ing with inhibition. No statistics 
were presented. Meshchersky (cited 
by Stahl, 1960) irradiated the cortex 
of rabbits directly with 25-200 r. of 
gamma or X rays. The strength and 
accuracy of CRs increased, differen- 
tiation was improved, and an in- 
crease in latency and of random 
motor activity between trials was 
also noticeable. No statistical analysis
of the data is available. Meizerov
(1959) applied 3- to 15-r. WBR to
three dogs up to a cumulative dose 
of 777-1,470 r. The Ss were observed 
for 2.5 years. The CRs with food as 
the reinforcer showed decrements 
before the appearance of radiation 
sickness. However, at times the re- 
sponses were normal. When the dose 
reached 500 r. responses to the CS 
became very weak. Despite the 
author's claims that UCRs were 
unaffected, gastrointestinal disturb- 
ances which are associated with ioniz- 
ing irradiation probably interfered 
with the testing. 

In contrast to these studies in 
which CR changes were observed 
with doses at or below the whole-body
LD50 level, some Soviet investigators
report that much higher doses
are required to demonstrate altera- 
tion in CR activity. 

Lomonos (1957) reported that 400 
r. did not affect conditioned vomiting 
responses in dogs. Korol’kova (1959) 
found that even 2,200 r. did not in- 
hibit the formation of new CRs, and 
Livanov and Biryukov (1958) re- 
viewed several studies on hens and 
pigeons where doses between 2,500
and over 4,000 r. did not produce 
permanent changes in the stability of 
CRs or differentiation despite the 
fact that optokinetic changes, indi- 
cating brain-stem damage, could be 
observed. The authors then com- 
pared the findings in birds with a 
specific experiment performed on four 
dogs in which a 400-r. dose did pro- 
duce decrements in CRs following the 
onset of radiation sickness. They 
(Livanov & Biryukov, 1958) con- 
cluded: “The role of certain nervous 
formations in the reactions of the or- 
ganism to irradiation can be truly 
evaluated only if it is approached 
from an evolutionary aspect" (p.
279). Thus they stressed phylogenetic
factors in radiation sensitivity.
A study by the Czech researcher,
Karpfel (1958), also belongs in this 
section. He found that with 12 rab- 
bits survival, loss of body weight, 
and white blood cell changes after 
700-r. WBR were related to condi- 
tioning. The Ss that showed more 
rapid conditioning and discrimina- 
tion were less affected by the irradi- 
ation than the slower animals. 

Zhuk (1957) tested CRs using the 
Ivanov-Smolenskiy technique with 
12 workers who had been exposed to 
varying doses of ionizing radiations. 
He reported that they differed in con- 
ditioning and extinction from nor- 
mals. However, no specific data were 
presented. 

Conclusions. We have previously 
discussed some of the shortcomings of 
Soviet research reports. There are
several additional weaknesses especially
pertinent to the analysis of CR
data: 

1. Several Soviet studies have re- 
ported that irradiation alters various 
types of unconditioned reflexes (Fedo- 
rova, 1958; Komarov, 1957; Kudrit- 
skiy, 1955). Yet many of the CR 
studies did not measure or even men- 
tion UCR changes. Thus, the ob- 
served changes in conditioning might 
be ascribed to motivational rather 
than associative factors. 

2. It is well known that irradiation
depresses food intake for several
days, and body weight does not return
to the normal level for an even
longer period (Furchtgott, 1956).
Nevertheless, a number of studies 
employing hunger drive in instru- 
mental conditioning have completely 
ignored the drive variable. Here 
again the observed changes in per- 
formance may be primarily reflec- 
tions of changes in drive. 

3. Several of the studies used a 
self-control design in which CRs following
irradiation were compared
with the preirradiation responses,
with no control Ss tested. It is ques- 
tionable, therefore, whether the 
changes are ascribable solely to ir- 
radiation. 

4. Since with low doses the re- 
ported changes were frequently small, 
it would seem necessary to control 
the pre-experimental histories of the 
Ss. However, this was usually not 
mentioned, or Ss were used which 
actually had diverse pre-experimen- 
tal histories (cf. Lomonos, 1957). 

5. Quantitatively different changes 
were reported for animals having 
different "types of nervous systems" 
(Makarchenko & Zlatin, 1959; Sa- 
moylova, 1959; Yarullin, 1959). The 
investigators did not state whether 
the typologies were determined prior 
to irradiation or whether they are ex- 
planatory concepts arrived at post 
hoc to account for the large within- 
group variability. 

6. There is no agreement as to the
lowest effective dose. Some investigators
reported changes after a few r.,
even less than 1 r., while in other
studies the doses varied anywhere
from 50 to 2,000 r. There have been
no attempts to determine the threshold
dose systematically, and few
studies reported negative
findings, i.e., dose levels at which no
CR changes were observed. Furthermore,
comparisons between studies
are difficult since different species 
and/or different experimental para- 
digms were used. 

7. The temporal duration and 
permanence of the changes were usu- 
ally not stated. 

The Soviet research on conditioning
should be regarded as exploratory
in nature. Since there have been no 
Western studies on classical condi- 
tioning, the Soviet work serves as a 
guide to those interested in the 
problem.
Most of the studies with sublethal
doses reported increased between-ex- 
perimental-sessions variability rather 
than consistent lowering of response 
rates. In many instances the effects
were ascribed to changes in internal 
inhibition manifesting themselves in 
losses of stimulus differentiation, 
conditioned inhibition, and delayed 
reaction. It should also be noted that 
many of the reported changes were 
relatively small, evident only during 
prolonged intensive testing of a 
few Ss. 

To this reviewer it would seem that 
the results are explicable in terms of 
performance rather than by associa- 
tive changes. There is no evidence 
that the rate of CR acquisition or 
differentiation was affected. The 
fluctuating character of the changes 
also would tend to implicate perform- 
ance rather than associative factors. 


Western Research 


In the last review (Furchtgott, 
1956) no evidence was presented 
which would indicate loss of acquisition
per se in animals exposed to ionizing
radiations. However, performance
may be affected, and thus in
certain situations learning may also 
be impaired because of changes in 
nonassociative factors.

Blair (1958) subjected rats to 
5,000 r. of cranial X irradiation and 
then tested their acquisition of a 14- 
unit multiple T maze either immedi- 
ately, after 30 days, or after 60 days 
following treatment. The irradiated 
Ss learned the maze in fewer trials
than did the controls. Blair assumed
that the irradiated Ss had a lesser
tendency to explore or that they were
more motivated due to a greater loss
in body weight. It would seem, however,
that the latter hypothesis applies
only to the Ss tested 30 or 60
days after irradiation; yet the 60-day
group seemed to be poorer in terms
of trials and errors than the other two 
irradiated groups. If anything, those 
tested immediately following irradiation
should have been less motivated,
since food intake and body
weight decrease during Week 1 fol- 
lowing irradiation. Fields (1957) ex- 
posed rats to 0-, 100-, 200-, 400-, or 
600-r. WBR before or after Trials 5 or 
10 of learning a 40-unit elevated T 
maze. Error scores were the same for 
all groups. The author reported some 
differences in the number of Ss that 
reached the criterion of three con- 
secutive errorless trials which favored 
the controls, although no statistical 
analysis was presented. Furthermore, 
the author did not control for the decrement
in hunger motivation in the irradiated
groups. In another experiment
rats were tested in a 5-unit vertical V maze
varying in complexity 3 months after
exposure to 600 r., with some Ss receiving
an additional 200 or 400 r. for
total doses of 600, 800, or 1,000 r.
In terms of errors, only the
1,000-r. group differed significantly
from the controls or the other irradiated
groups. The author admitted that
the Ns per group were small and that 
there were also differences in running 
time which could have accounted for 
the results. The report did not indi- 
cate when the additional irradiations 
were administered. If they occurred 
just prior to the experiment, we may 
again assume that decrements in motivation
affected learning in the irradiated
groups. Fields' article is too
sketchy for a proper evaluation of the 
results. Jarrard (1958) exposed rats 
to a Co-60 source with one group re- 
ceiving 539 r. and a second group 
230 r. The Ss were tested 7 or 37 
days following treatment in a Lash- 
ley III water maze. The data showed 
very large variances for the irradiated
as well as control Ss, and no
overall radiation effects on learning
were discernible.

In contrast to these studies report- 
ing negative findings, Urmer and 
Brown (1960) trained 20 rats on a 14- 
unit elevated T maze, and after the 
Ss reached a criterion of one or zero
errorless trials half were exposed to 
400 r. of gamma radiation and the 
other half were kept as controls. The 
Ss were then tested on different maze 
patterns. The irradiated group made 
significantly more errors and reached 
the criterion of one or no errors per 
trial more slowly than did the controls.
Recently this author (unpublished
study) attempted to replicate
Urmer and Brown's (1960) experi- 
ment with 11 control and 11 X-irradi- 
ated Ss. He failed to find any error 
or trials to criterion differences be- 
tween the groups. 

Harlow and Moon (1956) exposed 
rhesus monkeys to 100 r. of X irradia- 
tion every 35 days until death and 
tested them in the intervals between 
irradiations on a modified Hamilton 
search test, successive discrimina- 
tions, oddity, and delayed response 
tests. With doses varying from 100 to 
800 r. there were no performance 
decrements. On the oddity and de- 
layed response tests the irradiated Ss 
were superior to controls, probably 
due to lessened general activity and 
distractibility. Similarly, Riopelle, 
Grodsky, and Ades (1956) tested 
rhesus monkeys following exposure to 
350, 1,000, and 2,000 r. (N = 4 per
irradiation group) and two groups of
nonirradiated controls on six different 
discrimination problems in the Wis- 
consin General Test Apparatus, elec- 
tric shock avoidance conditioning, 
and a spatial delayed response test. 
In the discrimination tests the irradi- 
ated Ss were equal to or superior to 
the controls (on two tests the differences
were significant), while on the
other two problems there were no
differences. 

In a series of experiments Brown
and co-workers studied discrimination 
and oddity learning, discrimination 
reversals, and delayed responses in 
rhesus monkeys subjected to mixed
gamma-neutron irradiation. With 
doses up to 616 rep (roentgen-equiva- 
lent-physical) from a mixed source, 
Brown, Overall, and Gentry (1959) 
found no effects in solving a series 
of 12 object-quality discriminations 
involving a novel stimulus in each 
problem. Two years following ex- 
posure monkeys receiving up to 
27-54 neutron rep plus 284-557 
gamma r. did not differ from con- 
trols in initial training and subse- 
quent reversal of an oddity problem 
(McDowell & Brown, 1959a). In
another study, the same Ss were 
studied in a spatial delayed response 
situation and again no differences 
were discernible between the control 
and irradiated Ss (McDowell & 
Brown, 1959b). In a simple discrimi- 
nation task under reduced cues the Ss 
that received a high dose (27-54 neu- 
tron rep plus 284-557 gamma r.) 
made significantly fewer errors than 
control Ss, while low-dose animals 
(10-16 neutron rep plus 70-140 
gamma r.) did not differ from controls
(McDowell & Brown, 1958a).
In another discrimination study with 
monkeys exposed to mixed gamma- 
neutron radiation up to 670 rem 
(roentgen-equivalent-man) it was 
again reported that high-dose Ss were
superior to low-dose and control animals
(McDowell & Brown, in press).

* The studies were initiated by Roger T.
Davis. His findings were essentially the same
as those reported by Brown and McDowell.
Some of his data appeared in United States Air
Force Report No. 57-59 and some data were
presented at the International Symposium on
the Response of the Nervous System to Ionizing
Radiations, Northwestern University,
1960.

McDowell, Brown, and White (1961)
tested monkeys who had received 
6,000 r. of X irradiation either to the 
frontal or the posterior cortical as- 
sociation areas 1.5 years prior to the 
experiment involving an oddity re- 
versal and a delayed response prob- 
lem. In both the original and re- 
versal learning the posterior group 
required fewer trials than did either 
the control or the frontal irradiated 
Ss, while the latter were inferior to 
the controls. On the delayed response
test there were no differences
between either of the irradiated groups
and the controls.

According to this group of investi- 
gators, superior learning or per- 
formance by irradiated animals can 
be ascribed to their lessened distracti- 
bility or responsivity to novel stimuli. 
McDowell (1958) observed rhesus 
monkeys in their home cages, 18 
of which had received 60-160 rem, 
16 received 300-620 rem, and 10 
were control animals. High-dose Ss 
made significantly fewer responses to 
extraneous uncontrolled noises than 
did the controls. During training in 
the Wisconsin General Test Apparatus,
controlled distracting stimuli
consisting of a flashing light, noises, 
and music were introduced. The con- 
trol Ss responded more to the dis- 
tracting stimuli than did the irradi- 
ated Ss. Another variable stressed by 
these investigators and related to 
lower distractibility was narrowing 
of the span of attention. Another 
corollary of this hypothesis is that 
irradiated animals are less likely to 
perceive stimuli which are irrelevant 
to the task at hand than are normal 
animals. Data in an experiment in 
which two successive discriminations 
were performed, with some groups 
having the discriminanda of the
second task present during training
on the first task, are interpreted by 
Brown, Carr, and Overall (1959) in 
terms of this hypothesis. Examina- 
tion of the results presented only 
graphically shows that the data were 
very complex, and, indeed, the au- 
thors base their interpretation on a 
significant triple-interaction obtained 
in an analysis of variance. 

Another attempt by McDowell
(1960) to test the hypothesis of narrowed
attention by measuring transfer
of a learned discrimination along a
peripheral cue gradient (the stimuli
in a Wisconsin General Test Apparatus
were displaced on both sides from
the central foodwell) failed to produce
unambiguous results. Low- and
median-dose Ss showed greater transfer
than did controls, while high-dose
Ss showed less transfer. To account
for the results McDowell had to resort
to a complex hypothesis involving,
in addition to narrowing of
attention, an interacting change
in drive state. Monkeys exposed
to 6,000 r. of X irradiation delivered
to either the anterior or posterior
cortex were also tested on the
same problem. Here the perform- 
ance of the control and irradiated Ss 
was essentially the same (McDowell 
& Brown, 1959c). 

Davis (1961), however, reported 
data which would indicate that irradiated
monkeys are more distractible
than control animals. Seven rhesus
monkeys that had previously received
1,100-r. WBR in three doses
spaced 1 year apart were tested with 
six control Ss on an object-discrimi- 
nation task. On odd numbered trials 
only a rewarded and a nonrewarded 
stimulus-object were present, but on 
even numbered trials three addi- 
tional superfluous objects were also 
introduced. In the last part of the 
learning series the irradiated Ss made
more errors than the control group
on the even-numbered trials, but not 
on the odd-numbered trials. 

Overall, Brown, and Gentry (1960) 
trained five groups of rhesus monkeys 
representing radiation dose levels 
varying from 0 to 616 rep of mixed 
gamma-neutron radiation on an in- 
termediate size discrimination prob- 
lem. The problem could be learned 
using either "absolute" or "relational"
cues. A test of transposition
was then employed to determine the 
strength of relational learning. There 
was an inverse relationship between 
radiation dose and the amount of re- 
lational learning. 

DiMascio, Azrin, Fuller, and Jetter 
(1956) tested delayed response per- 
formance in dogs exposed to 150 r. and 
300 r. of X radiation. Six of eight Ss 
that had received 300 r. died soon 
after the completion of the experi- 
ment. During a period in which the 
300-r. group exhibited clinical signs 
of radiation illness, its performance 
was diminished; this the authors 
ascribed to decreased attention. The 
150-r. Ss did not differ from controls. 


Summary 


It is apparent that doses up to 
several thousand r. do not interfere 
with learning of relatively difficult 
problems whenever radiation malaise 
or motivational changes do not in- 
hibit performance. Actually, the 
evidence points in the other direction. 
Several groups of investigators work- 
ing with rats as well as monkeys 
have demonstrated that in many 
types of problems irradiated Ss learn 
more rapidly than do controls. The 
simplest explanation of the superior 
learning in the irradiated Ss takes 
into account their decreased dis- 
tractibility and narrow “focus of 
attention." It is quite possible that 
these variables basically are dependent
on decreased general activity in
irradiated organisms. 


RETENTION 


The effects of irradiation on the 
long term retention of habits would 
seem to be of great interest, since it is 
well known that radiation may pro- 
duce chronic CNS changes with 
latencies ranging from months to 
years. Unfortunately, the general 
paucity of long term behavioral stud- 
ies applies also to the area of reten- 
tion. 

Blair and Arnold (1956) first 
trained rats in a 14-unit multiple T 
maze. After reaching the learning 
criterion, the animals were exposed 
to 2,500 r. of cranial X irradiation.
In the first part of the experiment, 
retention trials were run on Days 3, 
12, and 25 after treatment; and in
the second part, additional retention 
trials were conducted on Days 40, 60, 
and 80. On the third postirradiation 
day the controls tended to perform 
better than the experimental Ss in 
terms of errors and running time, but 
by Day 25 the reverse was true, and 
the irradiated Ss made significantly 
fewer errors and had shorter running 
times. The authors explained their
findings in terms of the greater
hunger drive of the irradiated Ss
from Postirradiation Day 20 onward 
and the lessened tendency of these Ss 
to "explore" the maze. They tested 
the hunger drive of the Ss in a Jen- 
kins-Warden obstruction box and 
found a higher frequency of crossings 
in the treated Ss. The “exploratory” 
hypothesis was based on observations 
of the Ss and the lowered variability 
of running times in the treated group. 

Davis, McDowell, Deter, and Steele 
(1956) first trained 16 rhesus mon- 
keys on six manipulation tasks, four 
object-quality discrimination learn- 
ing tasks, a delayed response task, an 


oddity-principle problem, and a re- 
duced-cue discrimination problem. 
Following intensive training on these 
tasks, 10 Ss were exposed to 400-r. 
WBR and 6 Ss were kept as controls. 
Only transitory performance changes 
corresponding to the period of radia- 
tion malaise could be detected. The 
same animals were retested approxi- 
mately 4 months later and the per- 
formance of the irradiated Ss was 
similar to that of the controls (Davis, 
McDowell, Grodsky, & Steele, 1958). 

Harlow, Schrier, and Simons (1956) 
tested two Java monkeys on several 
discrimination, oddity, and delayed 
response problems prior to a 62-hour 
exposure to a 90,000-foot altitude 
flight. Retesting immediately after 
exposure and again approximately 4 
months later revealed no losses of the 
acquired habits. 


Summary 

It is apparent that ionizing radia- 
tions do not produce losses in learned 
habits if performance is tested soon 
after irradiation or within a few 
months afterwards. The latter peri- 
od, however, may be too short for 
long-latency degenerative CNS 
changes to affect well established 
habits. 


PERFORMANCE 


The maintenance of acquired be- 
havior or performance is mainly a 
function of motivational or drive 
variables. Since drives are dependent 
on a variety of internal physiological 
mechanisms, radiations are going to 
alter performance whenever they af- 
fect one of those mechanisms. For 
example, in the previous section on 
learning it was pointed out several 
times that alterations in food intake 
which occur in several mammalian 
species immediately following whole-body
exposure to ionizing radiations
influence performance during this
period as well as subsequent ones, the 
latter being frequently a compensa- 
tory period of increased food intake. 


Radiation as a Drive Stimulus 


Garcia, Kimeldorf, and Koelling 
(1955) reported that gamma irradia- 
tion may be used as a UCS in the 
establishment of a conditioned avoid- 
ance response. Investigators at the 
United States Naval Radiological 
Defense Laboratory and at other 
places have studied this phenomenon
quite intensively. Recently Garcia, 
Kimeldorf, and Hunt (1961) reviewed 
this literature. The present summary 
is based mainly on their work. 

Two types of experimental para- 
digms have been utilized with this 
technique: (a) saccharin or other 
highly preferred substance when pres- 
ent during irradiation will acquire 
aversive characteristics and (b) ani- 
mals will avoid the compartment or 
box where irradiation occurred. The 
effects have been produced by X and 
gamma rays as well as by fast neu- 
tron bombardment. Rats, mice, and 
cats have all shown the conditioned 
aversive response, as well as monkeys 
(Harlow, 1962). The minimum
effective dose depends on the
rate of delivery, species, and probably
other factors, and under certain
conditions it may be as low as 10 r.:
aversions to the intake of fluids associated
with irradiation have been
established with a dose of
only 10 r. Spatial avoidance behavior,
on the other hand, requires a total
dose at least 10 times as great,
i.e., 100 r. or more. Ophthalmectomy
has ruled out visual cues; and
other studies have demonstrated that 
noxious odors, ozone, or nitrous oxide 
(Andrews & Cameron, 1960) are not 
the major variables underlying the 
phenomenon.
Garcia, Kimeldorf, and Hunt
(1960) ascribe the phenomenon to 
changes in diverse internal neural 
and humoral factors, primarily gas- 
trointestinal functions. They found 
that irradiating the abdomen is more 
effective in producing aversions than 
irradiation of the head, pelvis, or 
thorax. Furthermore, food aversions 
are much more readily established 
than spatial aversions. Overall, 
Brown, and Logie (1960) also pre- 
sented data which may be most 
readily explained in terms of altered 
internal physiological conditions ra- 
ther than in terms of visual percep- 
tual mechanisms. 

Arbit (1958) has suggested using 
radiations as a UCS in avoidance
conditioning studies. Radiation has 
the interesting property of not being 
mediated through the external sen- 
sory channels, and thus it might be 
considered as an "unconscious" 
stimulus. 

A study by Davis (1958) in which 
he found complex changes in the food 
preferences of monkeys 14 months 
after irradiation by a mixed neutron- 
gamma source might be related to the 
aversive conditioning phenomenon 
described previously. 


General Activity Changes 


In a continuation of the work of 
Kimeldorf on general activity, Cas- 
tanera, Jones, and Kimeldorf (1959) 
reported decreased activity in tam- 
bour and spring supported cages in 
rats, guinea pigs, and hamsters fol- 
lowing irradiation. Fields (1957) 
found decreased activity wheel per- 
formance for 2 days following ex- 
posure to 360 r. of X rays. A 180-r. 
dose produced no changes. In con- 
trast to these data Koch and Klemm 
(1960) found that spontaneous ac-
tivity in mice is not affected by a 500-r.
whole-body dose, but a 700-r. dose
produces a depression, primarily due
to the presence of moribund Ss in the 
group. Harlow and Moon (1956) 
found no differences in locomotor 
activity in monkeys after a cumula- 
tive dose of 300 r., but a dose of 900 r. 
significantly depressed activity. The 
measurements were taken for 20 
minutes in a special cage equipped 
with photoelectric cells in which 
movement interrupted two intersect- 
ing beams. McDowell, Davis, and 
Steele (1956) visually observed sev- 
eral types of activities in the home 
cages of monkeys before and after 
400-r. WBR. The treatment pro- 
duced a significant decrement in 
general activity during Postirradia- 
tion Days 13 through 16, decrements 
in the manipulation of cage parts, in- 
crements in self-care and grooming, 
and decreases in aggression toward 
other animals measured in terms of 
hitting, biting, grabbing, and vocal- 
ization. McDowell and Brown 
(1958b) also observed in monkeys 
decrements in locomotion and ob- 
ject-directed activity and an increase 
in self-care following exposure to a 
nuclear detonation when the dose 
varied from approximately 544-709
rem. Thirty to 60 days later only 
self-care differentiated the irradiated 
from control Ss (McDowell & Brown, 
1958c). 


Other Performance Changes 


Fields (1957) found that 360 r. de- 
pressed runway speed for approxi- 
mately 5 days after irradiation. 
However, he did not control for 
differences in drive. Koch (1958) also 
reported decrements in maze running 
time after a 500-r. X-ray dose. Again 
drive factors were not considered. In 
a follow-up experiment Koch and 
Klemm (1960) again found that 500 
r. depressed running speed, but a 
300-r. group did not differ from the
controls. However, when moribund
animals were excluded from the 500- 
r. group, the difference between con- 
trol and irradiated Ss was not signifi- 
cant. The authors thus emphasized 
a point which is frequently over- 
looked in studies in which irradiated 
animals are tested. Daily 1-hour 
exposures to X rays at the rate of 
50 r. per hour or higher depress bar
pressing in a Skinner box after a 
cumulative dose of 400-500 r. 
(Brown, Overall, Logie, & Wicker, 
1960). The depression in perform- 
ance may be associated either with 
changes in drive, muscular strength, 
or radiation-induced aversive condi- 
tioning. 

Schwartzbaum, Hunt, Davies, and 
Kimeldorf (1958) found lowered 
thresholds for electroconvulsive seiz- 
ures in rats following exposure to a 
single 500-r. dose WBR. While de- 
creased food intake may have con- 
tributed to the change, the duration 
and magnitude of the effect were
greater than that expected from the 
loss in body weight. 

Miller (1962) reported that the in- 
cidence of audiogenic seizures in mice 
is enhanced by exposure to very low 
levels of gamma radiation in the range 
of .14-2 r. during the first 30 days of 
life. Even an increase in the level of 
radioactive fallout contributed to the 
heightened susceptibility following 
irradiation. Tacker and Furchtgott 
(1962) attempted to replicate Miller’s
findings but were unable to pro-
duce increases in seizure susceptibil-
ity using a dose of 10 r. The basis for
the discrepancy in the results be- 
tween the two experiments is not 
apparent. 

The lowered distractibility of ir- 
radiated monkeys was previously 
discussed in the section on learning. 
In the same vein, McDowell and 
Brown (1962) tested monkeys under
conditions of repetitive work. Each S
was given 50 trials a day or until 
balking occurred for 44 days on a 
single discrimination task. A signifi- 
cantly larger proportion of control 
than irradiated Ss manifested balks.
These results support the hypothesis
of lowered distractibility or tendency 
to explore in irradiated animals. 


Human Studies 


Pape and Riedl (1956) and Klim-
ková-Deutschová and Odvárková
(1958) found electrocutaneous 
changes, the former after 2-3 r., while 
the latter reported only that the 
changes were apparent before any 
atrophy of the skin was detectable. 
Both groups reported qualitative 
changes in conductance without any 
statistical analyses of the data. Since 
several psychologists have assumed 
that skin conductance is one of the 
best indices of activation level, and 
performance in turn is directly re- 
lated to activation level up to a cer- 
tain point (Hebb, 1955), one may 
speculate whether there is any altera- 
tion in performance accompanying 
the changes in skin conductance. Of 
course, it is possible that the magni- 
tude of the skin conductance change, 
or autonomic nervous system change 
which Pape and Riedl associate with 
the irradiation, is insufficient to pro- 
duce any major changes in perform- 
ance. 

Soviet investigators have been con- 
cerned about the efficiency of individ- 
uals working with ionizing radiations. 
Danilin, Lukash, Malinovskaya, 
Skvirskaya, Serebryannikov, and 
Shoshina (1960) measured chronaxies 
in persons working with radioactive 
substances (N = 47) and compared
their responses with a group of 17 
control Ss. Both the chronaxie as 
well as the rheobase for the common 
flexors and extensors of the finger
were longer in the irradiated than in
the control group. The optic chron- 
axie (the method was not described) 
was also significantly higher in 38 
irradiated than in 12 control Ss. But
a later study (Skvirskaya, 1961) re- 
ported that 92 roentgenologists and 
roentgen technicians who had worked 
in the field from 3 to over 30 years,
with a majority of the group for over 
20 years, did not show any deleterious 
changes in their nervous system 
which would interfere with normal 
activity. 


Discussion and Conclusions 


Drive as a major determiner of 
performance is influenced by a host 
of internal homeostatic mechanisms, 
neural as well as nonneural. To the 
extent that ionizing radiations affect 
the latter, performance may be al- 
tered. Food and water intake are al- 
tered by radiations, and hunger and 
thirst are the most commonly used
drives in psychological studies. 

Decrements in general activity 
also will affect many types of be- 
havioral tests which use speed as one 
of the parameters of performance. 
Decreases in exploration can also be 
subsumed under this deficit. 

A good illustration of nonneural 
factors affecting behavior is provided 
by the studies in which ionizing radia- 
tions serve as a nonspecific drive 
stimulus inducing aversive behavior. 
The mechanism of stimulation has 
not been established, but the be- 
havior patterns have been described 
by several groups of independent in- 
vestigators. A number of parameters 
of this phenomenon have been stud- 
ied. To test the generality of the 
drive properties of radiation it would 
be interesting to see whether animals 
could acquire responses other than 
food aversions and spatial avoidance
in order to terminate radiation ex-
posure.

Finally, it should be noted that 
with weak drives or incentives gen- 
eral malaise, weakness, and other 
clinical radiation symptoms may 
adversely affect performance. 


SENSORY FUNCTIONS
Vision 

Visibility of X rays. There has 
been renewed interest in the visi- 
bility of X rays (Fehér, Sasisc & 
Valenta, 1956; Gurtovoi & Burdian- 
skaia, 1959, 1960a, 1960b; Tagüena, 
1957; Teissler & Vyskočil, 1957). The 
threshold for the appearance of phos- 
phenes is around 1 mr. for normal 
adults, according to Gurtovoi and 
Burdianskaia (1960a). In a neat ex- 
periment the same authors (1960b) 
demonstrated that the phenomenon 
is due primarily to changes in the 
rods and that the contribution of the 
cones is minimal. Baylor and Smith 
(1958) reported that water fleas ex- 
hibit positive geotaxis when exposed 
to X rays. The threshold for the re- 
sponse is between 160-180 r. 

Other investigators measured the 
electroretinogram (ERG) in response 
to X rays. Eleniüs and Sysimetsä 
(1957) observed b-waves resulting 
from a .2-second duration stimulus 
which delivered 4 r. at the level of the 
eye. The amplitude of the wave was 
similar to that of a .2-lux-light stimu- 
lus. In senile cataract cases the 
effective dose was less than 1 r. 
Bachofer and Wittry (1961) meas- 
ured ERGs from the grass frog retina. 
To match the amplitude of an .08- 
second 5.5-meter-candle light stimu- 
lus required a 6.48-r. X-ray dose. 
Further increases in the dose did not 
produce increased amplitudes of the 
b-wave. The authors concluded that 
the magnitude of stimulus necessary 
to evoke b-waves covered a 12,000-
fold range for light but only a 25-fold
range for X rays. The range of laten- 
cies in response to light was greater 
than in response to X rays. 

Visual Functions in Irradiated Or-
ganisms. Avakian (1958) reported
that the ERG b-wave is altered by 
X irradiation. Isolated whole frog 
eyes or retinal preparations were ex-
posed to 12-132 r. At the lower dose
levels the b-wave of the irradiated
preparations first exhibited a rise 10-
15 minutes after treatment and then a
rapid drop. With large doses the 
initial rise was absent, and the drop 
actually produced a negative phase. 
Demirchoglyan, Adunts, and Avak- 
yan (1957) also produced depressions 
of ERG by immersing frog retinas in 
a solution of Na₂HP³²O₄.

In an intriguing series of experi- 
ments Motokawa, Kohata, Komatsu, 
Chichibu, Koga, and Kasai (1957) 
reported that human electric phos- 
phene thresholds are elevated by a 
dose of 1 mr., and by extrapolating 
their data they concluded that a .4- 
mr. dose should also be detectable. 
The effect lasted for several days. In 
addition, individuals working with 
ionizing radiations exhibited a 
chronic elevation of this threshold
even though their leucocyte counts
were normal. Within the dose range
of 1-50 mr. there was an approxi- 
mate log relationship between the 
phosphene effect and the applied 
dose. The authors recommended 
use of this technique for the detection 
of low level exposures. In another 
study Motokawa, Umetsu, Kobaya- 
shi, and Kameyama (1956) reported 
that the threshold for the electric 
flicker phosphene (the stimulus is a 
20-cps sine wave and S presses a key 
as soon as he perceives a flicker) is 
raised by X irradiation. The minimal 
effective dose varied from .1 to .4 r.
The latency for the phenomenon was
a few minutes, and the maximum
effects were noticeable 20-30 min- 
utes after irradiation of the retina. 
These effects disappeared after ap- 
proximately 50 minutes. Irradiation 
of the body at places other than the 
retina did not alter the threshold, ex- 
cept in cases where there was scatter 
of the X rays on the retina. (This was
detected by means of a roentgen film 
placed on the eye.) 

These findings of Motokawa are 
very remarkable. However, a replica- 
tion of these results by other investi- 
gators would be desirable. Riggs, 
Cornsweet, and Lewis (1957), in at- 
tempting to replicate other studies 
of Motokawa on the enhancement of 
phosphenes by light, followed closely 
Motokawa’s procedure; and they 
noted that his technique does not 
produce very reliable thresholds. 
Lipetz (1955) found that X irradia- 
tion of the dark-adapted retina had 
effects similar to exposure to light; 
i.e., the threshold for a subsequent
light stimulus was raised. He used 
doses of 14-30 r. The effects were re- 
versible just as are adapting light 
effects. 

Brown and McDowell (1960) ob- 
served visual acuity deficits in mon- 
keys 3 years after exposure to whole- 
body mixed neutron and gamma radi- 
ation. The Ss had to differentiate a 
broken from a closed circle. With 
breaks up to 2 degrees irradiated Ss 
did not differ from controls, but with 
a 1-degree stimulus, Ss which had 
been exposed to 616 or 308 rep were 
inferior to controls and to Ss exposed 
to 154 or 77 rep. The data show a 
great deal of variability and a repli- 
cation of the experiment would seem 
to be called for. In another study us- 
ing the same type of stimuli these 
authors (McDowell & Brown, 1960b) 
compared nine control monkeys with 
four Ss which 2 years prior to the ex-
periment had received 6,000 r. fo-
cally to the posterior cortical associa- 
tion area and two Ss with focal irra- 
diation of the frontal association 
areas. On all problems the frontal 
group was inferior to the controls, 
while the posterior group differed 
from the controls only on problems 
where the separation was 7 degrees or 
less. The greater deficit in frontal 
than in posterior animals would seem 
to indicate that functions other than 
vision contribute to, or are perhaps 
mainly responsible for, the observed 
phenomenon. 

In contrast to these deleterious 
effects of irradiation, Dawson and 
Smith (1959) observed that the stim- 
ulus threshold for the dark-adapted 
lateral eye of the horseshoe crab is 
lowered by X-ray doses from 5 to 175 
r. The effect was apparently stable 
and could be elicited as long as 6 
hours after treatment. The basis for 
the discrepancy between this experi- 
ment and those reviewed previously 
is not apparent, unless we assume a 
species effect which is not too satis- 
factory an explanation. 

Summary. It has been shown that 
low doses of X rays are visible. At 
the same time, both animal as well as 
human studies have indicated that 
visual functions, especially scotopic, 
are depressed by ionizing radiations. 
It is not too far-fetched to assume 
that the photochemical processes in 
the receptors are affected. Lipetz 
(1955) found that pithed frogs gave 
the same responses to X rays as did 
animals with an intact CNS indicat- 
ing that the changes are in the retina 
itself. 

The individual studies have dealt 
with a variety of visual functions, but 
none has been investigated thor- 
oughly. Most of the studies ought to 
be replicated before arriving at defi- 
nitive conclusions about the effect of
ionizing radiations on visual func-
tions.


Other Sensory Modalities


Studies of sensory modalities other 
than vision have been very few. In 
most cases only scattered individual 
experiments or clinical data are 
available. 

Cutaneous and Internal Sensitivity. 
Delitsyna (1957) applied a single dose 
of 500-5,000 r. to the leg or foot of 
rabbits, and then she tested parietal 
cortical action potentials in response 
to stroking the skin of the irradiated 
area. After 500 r. the potentials were 
exaggerated for 3 days with some 
effects apparent after 7 days. The 
intensification was apparent even 
when areas adjacent to the locus of 
irradiation were stimulated. With a 
5,000-r. dose, on the other hand, the 
responses were weaker than normal 
and this was apparent 10-15 minutes 
after treatment. Two days after the 
treatment the reaction increased 
markedly, but for the following 4 
weeks the responses to touch were 
very unstable. At times they were 
either weak or completely absent. 
The same investigator (Delitsyna, 
1959) also studied EEG responses to 
skin stimuli in human patients re- 
ceiving therapeutic X irradiation of 
various parts of the body. The results 
were presented only in a summary 
form, but Delitsyna stated that her 
human and animal data were com- 
parable, except that the former were 
less pronounced and more unstable. 

Delitsyna (1957) also studied in 
cats neural discharges from the 
splanchnic nerve and from its ab- 
dominal branches in response to in- 
teroceptive stimulation. Only a sum- 
mary of the findings is presented, in- 
dicating that the effects of X irradia- 
tion depend on the preirradiation 
level of discharge and on the time
after exposure that the measurements
are taken. The details of the findings 
are not too clear, but in general it 
would seem that 500 r. of X irradia- 
tion produces a temporary hyper- 
activity of afferent discharges. The 
cortical potentials in response to in- 
teroceptive stimulation were also in- 
vestigated in rabbits which had re- 
ceived 1,000 r. of X irradiation. 
Again only transitory changes ap- 
peared and, "the most distinct pic- 
ture of trans-limit inhibition is ob- 
served 1-2 days before the death of 
the animal" (p. 57). The changes 
were ascribed to hyperactivity of 
subcortical vegetative centers. Liva-
nov (1957) also states that for the
first 48-72 hours after irradiation the 
afferent input from the somesthetic 
receptors is enhanced, but this is 
then followed by an inhibition. 
Stimuli which were ineffective prior 
to irradiation "overpowered the cor-
tex."

In summary we may say that So- 
viet investigators have reported al- 
tered cutaneous and interoceptive 
excitability in irradiated organisms. 
The locus of the changes, whether in 
the receptor, the afferent nerve, or in 
the CNS is not clearly specified. 
Komarov (1957) who studied blood 
pressure and respiration in response 
to stimulations of mechanoreceptors 
in irradiated cats stated that changes 
occurred in receptors, in nerves as 
well as in effectors. 

Smell and Taste. Kalmus and 
Farnsworth (1959) recorded subjec-
tive and objective changes in taste in
a person who received radiation to 
the posterior area of the tongue and 
the oral pharynx. There was a tem-
porary loss of sensitivity to NaCl,
sucrose, PTC, and quinine, but not 
to HCl. They speculated that the 
nerves rather than the receptors were 
the source of the deficit. There was a
gradual recovery of the sensitivity.
Olfactory perception was not affect- 
ed. Loss of taste sensitivity for all 
foods was also reported by a patient 
who had received 6,870 r. to the base 
of the tongue and 5,000 r. to the parot- 
id and submaxillary glands (Frisby, 
1961). Kuznetsova (1960) tested 
100 Ss between the ages of 29-40 who 
had worked on a betatron for 4-10 
years. In 43% of the cases there was 
a decrease in taste sensitivity, espe- 
cially to bitter stimuli. Olfactory 
thresholds were also higher for all 
stimuli tested. Koznova (1957), on 
the other hand, found that patients 
subjected to therapeutic irradiations 
show hypersensitivity and perverted 
smell, olfactory hallucinations and 
shortened adaptation times. She 
believed that the effects were of 
central origin. Grigor'ev (1958, p. 29) 
reported that in a study of 156
patients who had received radiation 
therapy, 32 reported metallic after- 
tastes and 30 olfactory hallucinations 
after 2,000-4,000 r. of local and 50- 
75 r. of whole-body irradiation. It 
should be noted, however, that other 
symptoms of radiation malaise, such 
as nausea, general weakness, head- 
ache, etc., were also present. 

In summary, then, it is apparent 
that exposures to ionizing radiations 
may produce temporary decrements 
in taste sensitivity. The minimal 
effective dose or other functional rela-
tionships have not been determined.
Various types of olfactory disturb- 
ances, including changes in thresh- 
olds and adaptation time, and hal- 
lucinations, have also been reported. 
The role of the CNS in these phe- 
nomena has not been established, al- 
though the presence of hallucinations 
would seem to imply cortical activity. 
Since visceral as well as olfactory 
functions have been ascribed to parts 
of the rhinencephalon (Pribram &
Kruger, 1953) and, apparently, the
ANS is the most radiosensitive por- 
tion of the nervous system, it is not 
too surprising that olfactory halluci- 
nations may occur after large doses 
of irradiation. The minor importance 
of the chemical senses in human ecol- 
ogy is not conducive to much re- 
search in these modalities. 

Vestibular Functions and Hearing. 
Moskovskaya (1959) studied 35 pa- 
tients 35-60 years old undergoing X- 
ray therapy. With a dose of less than 
2,500 r. the following symptoms were 
observed in at least 20% of the cases: 
unsteadiness of gait, weak spontane- 
ous divergence of both hands, and a 
prolongation of postrotational ny- 
stagmus from an average of 25-60 
seconds to 50-160 seconds. The 
duration of the latent period in 
caloric tests decreased from 30-45 to 
20-35 seconds, and the duration of 
the nystagmus increased up to 85 
seconds. Two weeks after the cessa- 
tion of irradiation, no marked sub- 
jective symptoms were reported, ex- 
cept for a slight general weakness. 
Another group of 15 patients 40-60 
years old who had received 5,000 to 
22,000 r. still showed some abnor- 
malities on rotation and caloric tests 
5-7 years following the termination 
of treatment. The author, however, 
did not present any data for this 
group. She attributed her findings to
a weakened inhibitory action of the 
cortex. 

Kozlov (1958) exposed 126 guinea 
pigs to 350-r. WBR, which is near the 
LD50 level. In 85% of the Ss the
auricular reflexes were disturbed. 
Cochlear microphonics were then 
tested at 8 frequencies from 500 to 
8,000 cps. Compared to controls the 
irradiated Ss showed increased thresh- 
olds at all frequencies varying from 
2 to 25 db. Histological examination 
revealed hemorrhages and morpho-
logical changes in the middle as well
as in the inner ear.

Murphy and Harris (1961), on the
other hand, trained rats to press a 
lever or cross over in a shuttlebox 
whenever a 4,000 cps tone was modu- 
lated sinusoidally either 4 db. in in- 
tensity or 4% in frequency, finding 
that 500 r. to the rear half of the skull 
had no effects on hearing. Neither 
were there any effects on the audio- 
gram as measured by the Preyer pinna
reflex or upon the 1-volt cochleogram. 

In summary the two studies on 
hearing which were reviewed present 
contradictory data, and both in turn 
differ from Girden and Culler's study 
(Girden, 1935), in which irradiated 
dogs showed increased auditory sen- 
sitivity. The results of the latter 
study bear some resemblance to 
Moskovskaya's (1959) finding of in- 
creased vestibular sensitivity. It 
would seem that more data are 
necessary in this area before defini- 
tive conclusions can be drawn. 


HUMAN ADULT CLINICAL STUDIES


Animal studies are obviously in- 
adequate for the assessment of com- 
plex behavioral functions; and for 
practical purposes it would be helpful 
if the animal data on “simpler” func- 
tions could be validated on the 
human level. However, the be-
havioral studies of individuals ex-
posed to ionizing radiations pose a
number of difficulties. 

We are only too aware of the large 
variability in all facets of human be- 
havior. To assess radiation-induced 
changes it is, therefore, necessary to 
have pre-exposure measures, which 
are seldom available in cases of un- 
planned exposures, such as fall-out or 
industrial accidents. Postexposure 
assessment of such individuals can 
produce meaningful data only when 
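gross qualitative aberrations appear,
such as amnesia, paralysis, blindness,
etc. A recent study of a scientist who
was subjected to therapeutic radia-
tions (Frisby, 1961) provides a good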
illustration of this problem. No data 
were available on his pre-exposure 
performance. On the postexposure 
tests he scored well above the norms 
for British adults. However, it was 
impossible to determine whether he 
suffered any loss or not. 

Planned exposures usually occur in 
therapeutic situations where the pa- 
tient's illness may interfere either 
directly or indirectly, through the 
anxiety which illness engenders or 
the prospects of having to undergo
radiation treatments, with an
assessment of many behavioral func-
tions.

Similarly, postexposure measures 
may be contaminated by anxiety or a 
lack of motivation brought forth by 
beliefs about the harmful effects of 
radiation. Hasterlik and Marinelli 
(1956), in reporting on the clinical 
manifestations in an individual ex- 
posed to 70 rem, attributed several 
of the clinical symptoms to anxiety 
and tension rather than to “radiation 
sickness" per se. 

Detailed behavioral studies of ex- 
posed individuals are very scant, and 
for all practical purposes nonexistent, 
except for some Soviet work. Since 
the studies on sensory functions were 
reviewed in a previous section, they 
will be omitted from further consid- 
eration. 


Four individuals exposed to 12-190
rem of mixed gamma and neutron
(mostly gamma) radiations were
tested on the Halstead battery, pur- 
porting to measure frontal lobe func- 
tions (Halstead, 1947), 3 days after 
exposure and again 1 year later. One 
victim was also retested 2 years after 
exposure. On all of the occasions all 
of the scores were within the normal
range (Hasterlik & Marinelli, 1956).
Again, it should be emphasized, how- 
ever, that this testing procedure could 
not have detected anything but gross 
damage. 

A-bomb victims at Hiroshima (Tsu-
iki et al., 1958) and at Nagasaki
(Konuma, Furutani, & Kubo, 1957) 
had a large number of various soma- 
tic complaints such as dizziness, sleep 
disturbances, irritability, forgetful- 
ness, headaches, etc. Whether these 
are chronic radiation effects or neu- 
roses is difficult to ascertain. 

Jammet, Mathe, Pendic, Duplan, 
Maupin, Latarjet, Kalic, Schwarzen- 
berg, Djukic, and Vigne (1959) re- 
ported on the clinical manifestations 
of six Jugoslavs who were exposed to 
approximately 300-1,200 rem of 
mixed neutron and gamma radia- 
tions. On Day 1 the following symp- 
toms were present: asthenia, depres- 
sion, anorexia, nausea, paresthesias 
of the upper extremities, and sweat-
ing. Between Weeks 4 and 7 the vic-
tims exhibited general confusion, and
anorexia and sweating reappeared.
The authors did not elucidate further 
on the mental symptoms. 

Sayenko-Lyubarskaya (1959) re- 
viewed the Soviet literature on neural 
changes in man. Several studies in- 
dicated altered autonomic activity in 
the course of therapeutic irradiations.
Frequently reported symptoms in- 
cluded: anorexia, apathy, irritability, 
depression, aversion to intellectual 
effort, insomnia or somnolence, as- 
thenia, headache, and impairment of 
memory. The recovery following the 
termination of exposures is fre- 
quently slow, lasting for several 
months. 

Another review by Grigor'ev
(1958) on the effect of radiation on 
the human nervous system covered 
many of the same studies. In addi-
tion, however, the author presented
several sets of data collected either by
himself or in collaboration with other 
investigators. In one group of 156 
patients who had received radiation 
therapy of 25-200 r. delivered sys- 
temically or 250-6,000 r. locally, 
Grigor'ev, Domshlak, and Daren- 
skaya observed the following symp- 
toms among others: nausea, general 
weakness, headache, somnolence, ver- 
tigo, anorexia, vomiting, salivary 
changes, insomnia, pains in the ab- 
domen and region of the heart, thirst, 
and olfactory and gustatory disturb- 
ances. Again it is noteworthy that 
functionally opposite reactions such 
as insomnia and somnolence or dry- 
ness of the mouth and increased sali- 
vation were both noted. Clinical and 
EEG data were also collected from 21 
individuals, including 3 hydrocepha- 
lics, 5 tumor cases, and 13 postopera- 
tive patients who were receiving pro- 
phylactic treatments following either 
local or systemic irradiations. The 
cumulative dose, distributed over 
several months in some cases, varied 
from 300 to 4,800 r. The initial 
clinical symptoms after local irradia- 
tion to the head did not differ from 
the symptoms which were reported 
after local irradiation of the thorax, 
abdomen, or extremities—or after 
systemic irradiation, the most fre- 
quently reported ones being nausea, 
weakness, somnolence or insomnia, 
pains in various parts in the body— 
including the head—and reduction of 
appetite. In most patients after the 
initial irradiation the thresholds of 
cortical excitability decreased for a 
few hours or days, but this was fol- 
lowed by a return to the normal level 
or even by decreased excitability. 
This was frequently noted by intensi- 
fication of the alpha waves. 

It is interesting to note that the 
CNS changes were the same regard- 
less of the locus of irradiation. It
would seem, therefore, that irradia-
tion produces some general nonlocal- 
ized humoral changes, which have an 
influence on the autonomic nervous 
system and also directly or second- 
arily through the autonomic nervous 
system on the CNS. Although 
Grigor'ev (1958) considers this possi- 
bility, he still assigns the primary role 
to the cortex. 


We do not exclude the significance of the 
humoral component. However, under these 
conditions we start from the point of view that 
the cerebral cortex unites in its activity all the 
links of the regulatory mechanisms—the 
various divisions of nervous system 
(animal, i.e., non-autonomic and vegetative) 
and all forms of humoral regulation (p. 114). 


The CNS changes were phasic in 
most instances, with hyperreactivity 
followed by hyporeactivity. In gen- 
eral the EEG data were quite similar 
to the experimental findings of Soviet 
researchers with animal Ss. In con- 
trast to some reports of animal stud- 
ies, Grigor'ev could not find any 
differences in the reactions of the or- 
ganism as a function of the initial 
state of the cortical processes. It 
should be noted, nevertheless, that 
his sample was small; the preirradia- 
tion debilities were diverse; and the 
reliability of the classification of the 
cortical processes did not seem to be 
very high. 

The Ss were followed in some cases 
as long as 2 years. There are inci- 
dental comments that some of the 
changes could be detected as long as 
5 months after treatment, but there 
is no systematic presentation of any 
long-term effects. The latter would 
seem to be an important area of in- 
vestigation. Dihlmann (1960), 
among others, assumed there are 
latent morphological changes result- 
ing even from doses below the gener- 
ally accepted tolerance levels and 
that there is probably no dose that
could not produce late changes. He
also stated there is no relationship be- 
tween dose level and latency of ap- 
pearance of morphological changes. 

A study of the Marshallese 5 years 
after the fall-out in which they were 
exposed to 69-175 r. of gamma radia-
tion and received superficial beta
burns revealed no obvious outward 
signs of aging (Conard, 1960). A 
variety of tests—visual acuity, motor 
reaction time, and a measure of hand 
strength—were also given to 84 unex- 
posed and 42 exposed people. None 
of these tests showed any differences 
between the exposed and unexposed 
individuals. 


Summary 

The acute phases of radiation ill- 
ness include a variety of behavioral 
symptoms and in cases of localized 
irradiation, where the general radia- 
tion syndrome is not present in its 
entirety, certain behavioral disturb- 
ances such as headaches, loss of ap- 
petite, localized pain, etc., have been 
reported. No data are available on 
long-term effects which would also 
test the radiation-induced aging hy- 
pothesis (Upton, 1957). The difficul- 
ties in assessing behavioral effects in 
people were discussed. 


CONCLUSIONS AND GENERAL
SUMMARY 


The past 6 years have witnessed a 
huge increase in studies of behavioral 
effects of ionizing radiations. The 
increasing uses of atomic energy in 
military and civilian affairs which 
willy-nilly affect most human beings 
(nuclear fall-out, to mention just one 
example) warrant an examination of 
the biological, including behavioral, 
effects of radiation. 

Somatic effects of radiation depend 
on the state of development at which 
the organism is exposed. So far no
good evidence is available for mor- 
phological changes in the nervous sys- 
tem in organisms irradiated in adult- 
hood. On the other hand, exposure of 
mammalian fetuses even to a rela- 
tively small dose of radiation pro- 
duces permanent neural damage. It 
is, therefore, necessary to consider 
separately the effects on developing 
organisms and those observed after 
exposure of adults. 

Several investigators both in this 
country as well as abroad, notably in 
the Soviet Union, have shown that 
mammals exposed in utero or soon 
after birth exhibit behavioral changes 
when tested after birth and these 
changes seem to persist throughout 
the animal's life-span. Irradiated ani-
mals are deficient in learning, except
for avoidance conditioning, in various 
types of motor skills and in mating 
activity. They also exhibit hyper- 
emotionality and hypo- or hyperac- 
tivity. Mental deficiency has also 
been observed in prenatally exposed 
children. 

While a start has been made in de- 
termining changes in several behav- 
ioral domains, others, such as percep- 
tion, have not been studied as yet. 
Information about the relationship 
between age at irradiation and subse- 
quent behavioral deficits is incom- 
plete thus far. Nor do we have data 
on the interrelationships of the ob- 
served changes. 

The behavioral data are in accord 
with anatomical findings, although 
there have been no studies as yet in 
which specific behavioral changes 
were correlated with specific morpho- 
logical aberrations. The latter ap- 
proach would seem to hold promise as 
a powerful technique for the analysis 
of the structural correlates of be-
havioral development. Radiation
techniques make it feasible, more 


readily than do other techniques, to 
produce congenital aberrations; and 
our understanding of a number of be- 
havioral functions, color vision to 
name just one example, has greatly 
benefited from the study of organisms 
with congenital defects. 

In contrast to the relative unanim- 
ity in assessing the effects of radia- 
tions in developing organisms, the 
results of the analysis of the neural as 
well as behavioral changes in adult 
irradiated organisms are quite contro- 
versial. Classically it has been main- 
tained that morphologically the CNS 
is relatively radioresistant. Recent 
studies by Western researchers fur- 
ther support this hypothesis. Soviet 
scientists, on the other hand, seem to
be divided on this issue. While most 
of the data indicate high tolerance of 
neural tissues, some experimenters 
report changes after 50 r. 

A much greater divergence of 
opinion between Western and Soviet 
scientists centers around the physio- 
logical, mainly EEG, changes follow- 
ing radiations. A number of Soviet 
workers claim that a dose of 1 r. or 
even less is sufficient for the produc- 
tion of distinct changes, consisting 
initially of neural hyperexcitability 
which is soon followed by hypoexcita- 
bility. The duration of this change 
depends on the dose. Western and 
some Soviet researchers, on the other 
hand, state that no physiological 
changes are discernible with doses be- 
low several hundred r. The effective 
dose is apparently lower for subcorti- 
cal than cortical structures. 

Of greater interest to the psychol- 
ogist are the Soviet reports that con- 
ditioning is also a very sensitive index 
of irradiation. Here again, it is 
claimed that relatively small doses of 
approximately 1 r. can alter certain
aspects of the CR, mainly the process 
of internal inhibition. The changes 
depend on the "type of organism."
Again, proper assessment of the re- 
ports is difficult. So far there has been 
no Western replication of the Soviet 
findings. To this reviewer it would 
seem that the reported changes reflect
primarily nonassociative factors in 
conditioning. 

American studies with various spe- 
cies have for the most part failed to 
show deficits in a variety of learning 
situations even after doses of several 
thousand r. Indeed, the irradiated Ss 
seem to be less distractible, or they 
have a narrower range of attention 
and are, therefore, frequently super- 
ior to control animals in the learning 
of many tasks. 

Different kinds of performance in 
contradistinction to learning are ad- 
versely affected by radiations. Since 
performance is affected by internal 
hormonal and metabolic changes and 
these are very sensitive to ionizing 
radiations, it should not be surprising 
to find that performance is altered 
following exposure to radiations. 
Food and water deprivation are 
among the most commonly used 
drives in psychological studies, and 
both food and water intake are af- 
fected by radiations. Data are also 
available which show radiation-in- 
duced decrements in general and in 
object directed activity, in range of 
attention and in distractibility. All of 
these measures probably reflect a 
lowered responsivity to changes in 
stimulation. 

An interesting and potentially use- 
ful finding has been that animals tend 
to avoid stimuli associated with radi- 
ation. Thus these stimuli may be 
tested as a novel kind of UCS which 
apparently does not stimulate di- 
rectly any of the external receptors. 
Radiations might, therefore, serve as 
an internal UCS of which the organ- 
ism is not aware. The effective dose 
for this phenomenon is apparently
well below the level which produces 
serious pathological effects. 

Certain reported sensory changes 
might also be associated with meta- 
bolic alterations. Visual functions, 
especially scotopic, are affected by 
radiations; and the Japanese physi- 
ologist, Motokawa, claims that the 
effective dose for this phenomenon is in
the milliroentgen range. The findings 
in this field are not unequivocal. 
While most studies report adverse 
effects, in one EEG study sensitivity 
actually increased following X irradi- 
ation. Since different investigators 
studied different aspects of the visual 
process, the data are not comparable. 
Obviously, more work needs to be 
done in this field, and this ought to 
include an analysis of the mechanism 
or locus of the changes. 

Radiation-induced changes have 
also been reported for hearing, taste, 
and smell; and here again the findings 
have been equivocal, including re- 
ports of enhanced, lowered, or un- 
affected sensitivity. Again, more re- 
search ought to be done here; and 
since metabolic processes probably 
play an important role in the receptor 
processes of all three functions, radia- 
tion might affect them independently 
of any neural changes. Both physio- 
logical as well as behavioral data are 
needed, since it is well known that at 
times action potentials from the sen- 
sory neurons would lead one to be- 
lieve that the animal can make dis- 
criminations; yet there is no be- 
havioral evidence for such discrimina- 
tions (Hess, 1960, p. 74). 

A variety of behavioral symptoms 
occurs in the acute phases of radia- 
tion illness. On the human level only 
qualitative data are available. It 
must be realized, of course, that the 
assessment of changes in various be- 
havioral functions on the adult level 
poses a number of difficulties. Yet for
practical reasons it is essential to have 
both short- as well as long-term quan- 
titative human data on the effects of 
large as well as small amounts of 
radiation (United Nations, 1958, p. 
29). 

It has been assumed that radiation 
induces or mimics aging. On the be- 
havioral level there are only limited 
data bearing on this hypothesis. If 
further research should support the 
hypothesis, we would have available 
a useful tool for the experimental 
analysis of behavioral age changes. 
In the laboratory age changes could 
be hastened and thus made more 
amenable to investigation.

The very important problem of 
genetic effects was not discussed since 
no behavioral data have been avail- 
able. This area of study also presents 
great difficulties for the experimenter, 
but as the United Nations Committee
(1958, pp. 34-35) has recommended, 
work should be initiated on the 
genetic consequences of radiation,
and this work needs to include behavioral
studies. Omitted from consideration 
were also the applications of radioiso- 
topes in the analysis of neural corre- 
lates of behavioral processes (cf. 
Rosenzweig, Krech, & Bennett, 1957) 
or the use of radiations for the pro- 
duction of circumscribed neural le- 
sions (Malis, Loevinger, Kruger, & 
Rose, 1957). 

Finally, to repeat a theme touched 
upon several times in this review, 
radiations are potentially useful as 
research tools in the analysis of di- 
verse behavioral problems. 


REFERENCES 


ALEXANDROVSKAYA, M. M. [Some morpho-
logical and histopathological changes
occurring in the central nervous system of
white rats irradiated during the antenatal
period.] Med. Radiol., 1959, 4(11), 10-14.


ERNEST FURCHTGOTT 


ANDERSON, E. E. The interrelationship of 
drives in the male rat: II. Interrelations 
among measures of emotional, sexual, and 
exploratory behavior. J. genet. Psychol.,
1938, 53, 335-352. 

ANDREWS, H. L., & CAMERON, L. M. Radia-
tion avoidance in the mouse. Proc. Soc.
Exp. Biol. Med., 1960, 103, 565-567. 

ARBIT, J. Avoidance conditioning through
irradiation: A note on physiological 
mechanisms and psychological implications. 
Psychol. Rev., 1958, 65, 167-169. 

AVAKIAN, Ts. M. [Disturbance of the func- 
tions of the retina of the eye under the 
action of weak irradiation.] Biofisika, 1958, 
3, 114-116. 

BACHOFER, C. S., & WITTRY, S. E. Electro-
retinogram in response to X-ray stimula-
tion. Science, 1961, 133, 642-644.

BAILEY, O. T., INGRAHAM, F. D., & BERING,
E. A., JR. The late effects of Tantalum 182 
radiation on the cerebral cortex of monkeys. 
J. Neuropathol. exp. Neurol., 1958, 17, 151- 
157. 


BAŠIĆ, M., & WEBER, D. Über intrauterine
Fruchtschädigung durch Röntgenstrahlen. 
Strahlentherapie, 1956, 99, 628-634. 

BAYLOR, E. R., & SMITH, F. E. Animal per-
ception of X-rays. Radiat. Res., 1958, 8,
466-474. 

BrLokoNski, I. S. [Changes in the higher 
nervous activity of rats during X-irradia- 
tion.] Med. Radiol., 1959, 4(12), 11-16. 

BLAIR, W. C. The effects of cranial X radia-
tion on maze acquisition in rats. J. comp.
physiol. Psychol., 1958, 51, 175-177. 

BLAIR, W. C., & ARNOLD, W. J. The effects of
cranial X radiation on retention of maze 
learning in rats. J. comp. physiol. Psychol., 
1956, 49, 525-528. 

BORN, W. Zur Auslösung von Reflexen bei
Schnecken durch Röntgen und Alpha
Strahlen. Strahlentherapie, 1960, 112, 634- 
636. 


BRADY, J. V., & BUNNELL, B. N. Behavior
and the nervous system. In R. H. Waters,
D. A. Rethlingshafer, and W. E. Caldwell
(Eds.), Principles of comparative psychology.
New York: McGraw-Hill, 1960. Pp. 355-

Brent, R. L. The indirect effect of irradiation 
on embryonic development: II. Irradiation 
of the placenta. Amer. J. Dis. Children, 
1960, 100, 103-108. 

BRENT, R. L., & McLAUGHLIN, M. M. The
indirect effect of irradiation on embryonic 
development: I. Irradiation of the mother 


EFFECTS OF IONIZING RADIATIONS 193 


while shielding the embryonic side. Amer. 
J. Dis. Children, 1960, 100, 94-102. 

BROWN, W. L., CARR, R. M., & OVERALL,
J. E. The effect of chronic whole-body 
irradiation on i cue associations. 
J. gen. Psychol., 1959, 61, 113-119. 

BROWN, W. L., & McDOWELL, A. A. Visual
acuity performance of normal and chronic 
irradiated monkeys. J. genet. Psychol., 
1960, 96, 133-137. 

BROWN, W. L., OVERALL, J. E., & GENTRY,
G. V. Absolute versus relational discrimi- 
nation of intermediate size in the rhesus 
monkey. USAF Sch. Aviat. Med. Rep., 
1965, No. 59-12. 

BROWN, W. L., OVERALL, J. E., LOGIE, L. C.,
& WICKER, J. E. Lever-pressing behavior of
albino rats during prolonged exposure to X-
radiation. Radiat. Res., 1960, 13, 617-
631.

CASTANERA, T. J., JONES, D. C., & KIMEL-
DORF, D. J. The effect of X-irradiation on
the diffuse activity performance of rats, 
guinea pigs, and hamsters. Brit. J. Radiol., 
1959, 32, 386-389. 

CASTER, W. O., REDGATE, E. S., & ARM-
STRONG, W. D. Changes in the central
nervous system after 700 r. total-body X- 
irradiation. Radiat. Res., 1958, 8, 92-97. 

CHESNOKOVA, A. P. [The study of nervous 
mechanisms of the higher nervous activity 
disturbance of white rats in the early stage 
of ontogenesis in the action of a single dose 
of ionizing radiation.] Med. Radiol., 1959, 
4(4), 16-21. 

CLEMENTE, C. D., YAMAZAKI, J. N., BENNETT, 
L. R., & McFALL, R. A. Brain radiation in
newborn rats and differential effects of in- 
creased age: II. Microscopic observations. 
Neurology, 1960, 10, 669-675. 

CONARD, R. A. An attempt to quantify some
clinical criteria of aging. J. Gerontol., 1960, 
15, 358-365. 

COURVILLE, C. B., & EDMONDSON, H. A.
Mental deficiency from intrauterine expo- 
sure to radiation. Bull. Los Angeles Neurol. 
Soc., 1958, 23, 11-20.

Cowen, D., & GELLER, L. M. Long-term 
pathological effects of prenatal X-irradia- 
tion on the central nervous system. J. 
Neuropathol. exp. Neurol., 1960, 19, 488-


DawiLIN, A. A., Luxasn, N. I., MartNov-
SKAYA, T. Ya., Sxivmskava, K. B.,
SEREBRYANNIKOV, V. D., & SHOSHINA,
G. A. [The condition of the nervous sys-
tem of persons working with radioactive
(HIE Med. Radiol., 1960, 5(5), 37-


DAVIS, R. T. Latent changes in the food pref-
erences of irradiated monkeys. J. genet.
Psychol., 1958, 92, 53-59. 

Davis, R. T. Discriminative performance of 
monkeys irradiated with X-rays. Amer. J.
Psychol., 1961, 74, 86-89. 

DAVIS, R. T., McDOWELL, A. A., Deters,
C. N., & STEELE, J. P. Performance of
rhesus monkeys on selected laboratory tasks 
presented before and after a large single 
dose of whole-body X radiation. J. comp. 
physiol. Psychol., 1956, 49, 20-26.

DAVIS, R. T., McDOWELL, A. A., Gronsky,
M. A., & STEELE, J. P. The performance of
X-ray irradiated and non-irradiated rhesus 
monkeys before, during, and following 
chronic barbiturate sedation. J. genet. 


receptor. Science, 1959, 129, 1670-1671. 

DEGENHARDT, K. H., & Grirer, H. J.
Durch Röntgenstrahlen induzierte Entwick-
lungsstörungen bei Kaninchenembryonen.
Z. Naturforsch., 1959, 14, 753-756. 

DELITSYNA, N. S. [Concerning some changes
in receptor systems under the influence of 
X rays.] In, [Works of the All Union Confer- 
ence on Medical Radiology: Experimental 
Medical Radiology.] (Atomic Energy Com- 
mission Tech. Rep. No. 3661, 1959, pp. 
51-60. Translation obtainable from the 
United States Department of Commerce, 
Office of Technical Services.) Moscow: 
Trudy, 1957. 

DELITSYNA, N. S. [Studies of receptor action
in irradiated areas of humans undergoing
irradiation therapy.] Med. Radiol., 1959,
4(8), 73-76. 

DEMIRCHOGLYAN, G. G., ADUNTS, G. Ts., &
AVAKYAN, Ts. M. [The action of radioactive
phosphorus on the functional state of the
retina.] In, [Works of the All Union Con-
ference on Medical Radiology: Experimental
Medical Radiology.] (Atomic Energy Com-
mission Tech. Rep. No. 3661, 1959, pp. 164-
169. Translation obtainable from the
United States Department of Commerce,
Office of Technical Services.) Moscow:
Trudy, 1957.

DIHLMANN, W. Zur Morphologie, Theorie
und Problematik der Strahlenspätschäden
im Zentralnervensystem. Strahlentherapie, 
1960, 112, 567-586. 

DiMASCIO, A., AZRIN, N. H., FULLER, Lata
& JETTER, W. The effect of total-body X-
irradiation on delayed-response perform-
ance of dogs. J. comp. physiol. Psychol.,
1956, 49, 600-604.

ELENIUS, V., & SYSIMETSÄ, E. Measurement
of the human electroretinographic roentgen
threshold dose. Acta radiol., 1957, 48,
465-469.

FEDOROVA, I. V. [The change in the latent pe-
riod of the flexor reflex of the skin in rab-
bits following general irradiation with a 10
r. dose of X rays.] Med. Radiol., 1958,
3(2), 32-36. 

Fenér, L., Sastsc, S., & VALENTA, A. [Data 
on the invisibility of X rays.] Orv. Hetil., 
1956, 97, 748-750. 

FIELDS, P. E. The effect of whole-body X
radiation upon activity drum, straightaway, 
and maze performances of white rats. J. 
comp. physiol. Psychol., 1957, 50, 386-391.

FRISBY, C. B. A note on radiation treatment
in relation to performance on certain tests. 
Brit. J. Psychol., 1961, 52, 65-70.

FRITZ-NIGGLI, HEDI. Strahlenbiologie: Grund-
lagen und Ergebnisse. Stuttgart: Thieme,
1959.


FURCHTGOTT, E. Behavioral effects of ioniz-
ing radiations. Psychol. Bull., 1956, 53,
321-334. 

FURCHTGOTT, E., & ECHOLS, M. Activity and
emotionality in pre- and neonatally X- 
irradiated rats. J. comp. physiol. Psychol., 
1958, 51, 541-545. (a) 

FURCHTGOTT, E., & ECHOLS, M. Locomotor
coordination following pre- and neonatal 
X irradiation. J. comp. physiol. Psychol., 
1958, 51, 292-294. (b) 

FURCHTGOTT, E., ECHOLS, M., & DEES, J. W.
Motor functions in fetally irradiated rats. 
Paper read at American Psychological 
Association, Chicago, September 1960. 

FURCHTGOTT, E., ECHOLS, M., & OPENSHAW,
J. W. Maze learning in pre- and neonatally 
X-irradiated rats. J. comp. physiol. 
Psychol., 1958, 51, 178-180. 

FURCHTGOTT, E., MURPHREE, R. L., PACE,
H. B., & Dees, J. W. Mating activity in 
fetally irradiated male swine and rats. 
Psychol. Rep., 1959, 5, 545-548. 

FURCHTGOTT, E., & WECHKIN, S. Avoidance
conditioning as a function of prenatal 
irradiation and age. J. comp. physiol. 
Psychol., 1962, 55, 69-72. 

GANGLOFF, H., & HALEY, T. J. Veränder-
ungen der elektrischen Hirntätigkeit bei
Katzen nach Röntgentotalbestrahlung. 
Experientia, 1959, 15, 397-399. 

GANGLOFF, H., & HALEY, T. J. Effects of X-
irradiation on spontaneous and evoked
brain electrical activity in cats. Radiat.
Res., 1960, 12, 694-704.

GARCIA, J., KIMELDORF, D. J., & HUNT,
E. L. The use of ionizing radiation as a
motivating stimulus. Psychol. Rev., 1961,
68, 383-395.

GARCIA, J., KIMELDORF, D. J., & KOELLING,
R. A. Conditioned aversion to saccharin
resulting from exposure to gamma radiation.
Science, 1955, 122, 157-158. 

GIRDEN, E. Effect of roentgen rays upon hear-
ing in dogs. J. comp. Psychol., 1935, 20,
263-290. 

GORSHELEVA, L. S. The effect of single ex- 
posures to small doses of X-rays on trace 
motor alimentary conditioned reflexes in 
albino rats. Pavlov J. higher nerv. Activ., 
1960, 10, 476-480. 

GRAHAM, T. M., MARKS, A., & ERSHOFF,
B. H. Effects of prenatal X-irradiation on 
discrimination learning in the rat. Proc. 
Soc. Exp. Biol. Med., 1959, 100, 78-81. 

GRIGOR'EV, Y. G. [Data on the reactions of the
human central nervous system to ionising
radiations.] (Atomic Energy Commission
Tech. Rep. No. 4284, 1960. Translation 
obtainable from the United States Depart- 
ment of Commerce, Office of Technical 
Services.) Moscow: 1958. 

GURTOVOI, G. K., & BURDIANSKAIA, E. O.
Visual sensations induced by X-irradiation 
of the eye with doses of the order of one 
milliroentgen. Biophysics, 1959, 4(6), 65- 
71. 


GURTOVOI, G. K., & BURDIANSKAIA, E. O.
Exact dose X-irradiation of various regions 
of the head and visual sensations: X-ray 
location method of study of the reactivity 
of the central nervous system. Biophysics, 
1960, 5, 406-414. (a)

GURTOVOI, G. K., & BURDIANSKAIA, E. O.
Liminal reactivity of various regions of the 
human retina to roentgen irradiation. 
Biophysics, 1960, 5, 538-543. (b) 

HAEFNER, K. Der Einfluss von Röntgen-
bestrahlung während der Embryonalent-
wicklung auf das Labyrinthverhalten der
oe Fortschr. Röntgenstr., 1960, 93, 648-

HALSTEAD, W. C. Brain and intelligence.
Chicago: Univer. Chicago Press, 1947. 

HARLOW, H. F. Effects of radiation on the
CNS and behavior: General survey. In
T. J. Haley & R. S. Snider (Eds.), Response
of the nervous system to ionizing radiation.
New York: Academic Press, 1962. Pp. 627-


HARLOW, H. F., & MOON, L. E. The effect of


repeated doses of total-body X-radiation 
on motivation and learning in rhesus mon- 
keys. J. comp. physiol. Psychol., 1956, 49, 


HARLOW, H. F., SCHRIER, A. M., & SIMONS,
D. G. Exposure of primates to cosmic radia-
tion above 90,000 feet. J. comp. physiol.
Psychol., 1956, 49, 195-200.

HASTERLIK, R. J., & MARINELLI, L. D.
Physical dosimetry and clinical observa-
tions on four human beings involved in an
accidental critical assembly excursion. In, 
Proceedings of the International Conference 
on the Peaceful Uses of Atomic Energy. 
Vol. 11. Geneva: United Nations, 1956. 
Pp. 25-34. 

Haymaker, W., Laqueur, G., Nauta, W. J., 
PICKERING, J. E., SLOPER, J. C., & VOGEL,
F. S. The effects of barium 140-lanthanum 
140 (gamma) radiation on the central 
nervous system and pituitary gland of 
macaque monkeys: A study of 67 brains and 
spinal cords and 77 pituitary glands. J. 
Neuropathol. exp. Neurol., 1958, 17, 12-
51. 

HEBB, D. O. The organization of behavior.
New York: Wiley, 1949.

HEBB, D. O. Drives and the C.N.S. (con-
ceptual nervous system). Psychol. Rev.,
1955, 62, 243-254.

HESS, E. H. Sensory processes. In R. H.
Waters, D. A. Rethlingshafer, and W. E. 
Caldwell (Eds.), Principles of comparative 
psychology. New York: McGraw-Hill, 
1960. Pp. 74-101. 

Hicks, S. P. Developmental malformations 
produced by radiation: A timetable of 
their development. Amer. J. Roentgenol., 
1953, 69, 272-293. 

Hicks, S. P. The effects of ionizing radiation, 
certain hormones, and radiomimetic drugs 
on the developing nervous system. J. cell. 
comp. Physiol., 1954, 43(Suppl. 1), 151-
178. 

Hicks, S. P. Radiation as an experimental 
tool in mammalian developmental neurol- 
ogy. Physiol. Rev., 1958, 38, 337-356. 

Hicks, S. P., Brown, B. L., & D'AMATO, 
C. J. Regeneration and malformation in the 
nervous system, eye, and mesenchyme of 
the mammalian embryo after radiation in- 
jury. Amer. J. Pathol., 1957, 33, 459-481. 

HICKS, S. P., D'AMATO, C. J., & LOWE, J. L.
The development of the mammalian nerv- 
ous system: I. Malformations of the brain, 
especially the cerebral cortex, induced in 
rats by radiation. II. Some mechanisms of
the malformations of the cortex. J. comp. 
Neurol., 1959, 113, 435-469. 

HOLLAENDER, A. (Ed.) Radiation biology. 
New York: McGraw-Hill, 1954. 2 vols. 

HUG, O. D. Die Auslösung von Fühlerre-
flexen bei Schnecken durch Röntgen und
Alphastrahlen. Strahlentherapie, 1958, 106, 
155-160. 


System.] 
Fiziol. eksp. Ter., 1959, 3(1), 93-94. (b) 
Izumi, N. Effect of the atomic bomb on school 
children in Urakami District, Nagasaki. 
In, Research in the effects and i — 


Tokyo: Japanese Society for the Promotion 
of Science, 1956. Pp. 1701-1707. 
JAMMET, H., MATHÉ, G., PENDIC, B., DUPLAN,
J. F., MAUPIN, B., LATARJET, R., KALIC,
D., SCHWARZENBERG, L., DJUKIC, Z., &
VIGNE, J. [Study of six cases of
whole-body irradiation.] Rev. Franc. Etud,


xovÁ, J. Messungen 
ungskurven bei Werktätigen unter Ein-
wirkung ionisierender Strahlen. Arch.
Gewerbepathol., 1958, 16, 178-183. 
Kocs, R. deae und A 
ysiologische Untersuchungen am t- 
atinan Tier. Fortsch. Rontgenstr., 


1957, 2(3), 3-8. 
KONUMA, M., FURUTANI, M., & KUBO, S.
Syndrome of diencephalogenic nature as an
A-bomb sickness sequela. Hiroshima med.
J., 1957, 5, 369-392.
KOROL'KOVA, T. A. [Electrophysiological
studies of the effects of ionizing radiation on
the functional state of the cerebral cortex
under normal and pathological conditions.]
Trud. Inst. Vissh. Nervn. Deyatel., Ser.
fiziol., 1959, 3, 121-135.

Kosmarskaya, E. N., & Barasanev, Y. I. 
[The effect of single irradiation by X rays 
on the developing brain in rats.] Med.
Radiol., 1958, 3(2), 23-31. 

KosvAkov, K. S. [The effect of X rays on the 
concentration of lactic acid, adenosine- 
triphosphoric acid, creatine phosphate and 
inorganic phosphorus in the brains of 
rats.] Med. Radiol., 1959, 4(10), 79-80.

Kosvakov, K. S. [The influence of X rays 
on the catalase activity in the brains of 
mice.] Med. Radiol., 1960, 5(5), 76-77.

KOZLOV, M. Ya. [Changes in the peripheral
section of the auditory analyzer in radiation 
sickness.] Vestn. Oto-Rino-Laryngol., 1958, 
20(2), 29-35. 

KOZLOVA, C. B. [Radio-induced olfactory dis-
turbances in people.] Med. Radiol., 1957,
2(2), 26-30. 

KRATCHMAN, J., & GRAHN, D. Relationship
between the geologic environment and 
mortality from congenital malformation. 
Report No. TID-8204, 1959, United States 
Atomic Energy Commission Technical In- 
formation Service. 

KROEBEL, W., & KROHM, G. Die Wirkung
geringer Strahlungsdosen auf die Signal-
erzeugungs und Fortleiterungseigenschaften
in Froschnerven. Atomkernenergie, 1959, 
4, 280-286. 

KUDRITSKIY, Iu. K. [Changes in the excitabil-
ity of the motor reflex under a summarized
effect of small doses of roentgen rays.]
Vestn. Rentgenol., 1955, 31(6), 15-21.

Kuznetsova, L. V. [The senses of taste and 
smell in persons working on betatrons.] 
Med. Radiol., 1960, 5(4), 82-84. 

LawrvÉJoL, P., HERVET, E., & PETIT, J.
Malformation foetale et radioactivité. 
Ann. Med. leg., 1957, 37, 179-181. 

Lea, D. E. Actions of radiations on living 
cells. (2nd ed.) Cambridge: Univer.
Cambridge Press, 1955. 

LEBEDINSKY, A. V. Biological effects of radia- 
tion: U.N. survey. In, Proceedings of the 
Second United Nations International Con- 
ference on the Peaceful Uses of Atomic En- 
ergy. Vol. 22. Geneva: United Nations, 
1958. Pp. 5-16. 

LEBEDINSKY, A. V., GRIGORYEV, U. G., &
DEMIRCHOGLYAN, G. G. On the biological 
effect of small doses of ionizing radiation. 
In, Proceedings of the Second United Na-
tions International Conference on the Peace-
ful Uses of Atomic Energy. Vol. 22. Geneva:
United Nations, 1958. Pp. 17-28.


LENGEROVÁ, A. Effects of intrauterine
irradiation in rats in relation to the stage
of embryogenesis at the time of exposure.
Folia biol., 1957, 3, 321-332. 

LEVINSON, B., & ZEIGLER, H. P. The effects
of neonatal X irradiation upon learning in 
the rat. J. comp. physiol. Psychol., 1959, 52, 
53-55. 


LIPETZ, L. E. Electrophysiology of the X-
ray phosphene. Radiat. Res., 1955, 2,
306-329. 

Livanov, M. N. [Changes occurring within 
different parts of the central nervous system 
after exposure to X rays.] In, [Works of the 
All Union Conference on Medical Radiology: 
Experimental Medical Radiology.] (Atomic
Energy Commission Tech. Rep. No. 3661, 
1959, pp. 30-39. Translation obtainable 
from the United States Department of Com- 
merce, Office of Technical Services.) Mos- 
cow: Trudy, 1957. 

LIVANOV, M. N., & BIRYUKOV, D. A. Changes
in the nervous system caused by ionizing
radiation. In, Proceedings of the Second
United Nations International Conference on 
the Peaceful Uses of Atomic Energy. Vol. 22. 
Geneva: United Nations, 1958. Pp. 269- 
281. 

LOMONOS, P. I. [Conditioned-reflex activity of
dogs on intravenous administration of radio- 
active cobalt.] In, [Works of the All Union 
Conference on Medical Radiology: Experi- 
mental Medical Radiology.] (Atomic Energy 
Commission Tech. Rep. No. 3661, 1959, pp. 
79-88. Translation obtainable from the 
United States Department of Commerce, 
Office of Technical Services.) Moscow: 
Trudy, 1957. 

McDOWELL, A. A. Comparisons of distracti-
bility in irradiated and nonirradiated mon-
keys. J. genet. Psychol., 1958, 93, 63-72. 

McDOWELL, A. A. Transfer by normal and
chronic whole-body irradiated monkeys of a 
single learned discrimination along a pe- 
ripheral cue gradient. J. genet. Psychol., 
1960, 97, 41-58.

McDOWELL, A. A., & BROWN, W. L. Facilita-
tive effects of irradiation on performance
of monkeys on discrimination problems with 
reduced stimulus cues. J. genet. Psychol., 
1958, 93, 73-78. (a) 

McDOWELL, A. A., & BROWN, W. L. Some
effects of nuclear radiation exposure on the 
behavior of the rhesus monkey. USAF 
Sch. Aviat. Med. Rep., 1958, No. 58-58.

McDOWELL, A. A., & BROWN, W. L. Some
persisting effects of nuclear radiation expo-
sure on the behavior of the rhesus monkey.
USAF Sch. Aviat. Med. Rep., 1958, No. 58-

on an oddity-reversal problem. J.
Psychol., 1959, 95, 105-110. (a)

McDOWELL, A. A., & BROWN, W. L. Latent
effects of chronic whole-body irradiation 
upon the performance of monkeys on the 
spatial delayed-response problem. J. gen. 
Psychol., 1959, 61, 61-64. (b) 

McDOWELL, A. A., & BROWN, W. L. Transfer
by normal and chronic focal-head irradiated 
monkeys of a single learned discrimination 
along a peripheral cue gradient. USAF 
Sch. Aviat. Med. Rep., 1959, No. 59-18. (c)

McDOWELL, A. A., & BROWN, W. L. Com-
parisons of running wheel activity of normal
and chronic radiated rats under varying
conditions of food deprivation. J. genet.
Psychol., 1960, 93, 139-144. (b)

McDOWELL, A. A., & BROWN, W. L. Effects of
repetitious work on performance of normal 
and irradiated monkeys. J. genet. Psychol., 
1962, 15, 15-20. 

McDOWELL, A. A., & BROWN, W. L. Sex and
radiation as factors in learning performance. 
J. genet. Psychol., in press. 

McDOWELL, A. A., BROWN, W. L., & WHITE,
R. K. Oddity-reversal and delayed response 
performances of monkeys previously ex- 
posed to focal-head irradiation. J. genet. 
Psychol., 1961, 99, 75-81. 

McDOWELL, A. A., DAVIS, R. T., & STEELE,
J. P. Application of systematic direct ob- 
servational methods to analysis of the radia- 
tion syndrome in monkeys. Percept. mot. 
Skills, 1956, 6, 117-130. 

MAKARCHENKO, O. F., & Zrarm, R. S.
[Changes in the higher nervous activity of 
dogs under long-term exposure to small 
doses of ionizing radiation.] Fiziol. Zh. 
Akad. Nauk Ukr. R. S. R., 1959, 5, 16-21. 

MALIS, L. I., LOEVINGER, R., KRUGER, L., &
Rose, J. E. Production of laminar lesions in 
the cerebral cortex by heavy ionizing par- 
ticles. Science, 1957, 126, 302-303. 

MEIER, G. W. Behavioral irradiation effects
in the developing chick. Psychol. Rep., 
1959, 5, 3-9. 

Meuzerov, E. S. The influence of whole-body 
X-ray irradiation with fractionated doses 
on the conditioned activity of dogs. Bio- 
physics, 1959, 4, 460-470. 

MICHAILOVA, N. G. [Dependence between the
dose value of antenatal irradiation and the
state of higher nervous activity.] Med.
Radiol., 1960, 5(8), 22-26.

MILLER, DOROTHEA S. Effects of low-level
radiation on audiogenic convulsive seizures
in mice. In T. J. Haley & R. S. Snider (Eds.),
Response of the nervous system to ionizing
radiation. New York: Academic Press,


young individuals to the Hiroshima atomic 
bomb. Pediatrics, 1956, 18, 1-18.

MORGAN, C. T., & STELLAR, E. Physiological
psychology. (2nd ed.) New York: McGraw-
Hill, 1950.

MoskovskAvA, N. V. [Effect of ionizing 
radiations on the functions of the vestibular
analyzer.] Vestn. Otorinolaringol., 1959,
21(4), 59-62. 

MOTOKAWA, K., KOHATA, T., KOMATSU, M.,
CHICHIBU, S., KOGA, Y., & KASAI, T. A
sensitive method for detecting the effect of 
radiation upon the human body. Tohoku J. 
exp. Med., 1957, 66, 389-404. 

Moroxawa, K., Umersu, J., Konayasm, M., 

& Kawevama, M. The effect of a small dose 
of roentgen rays upon the human body as 
revealed by the method of electric flicker. 
Tohoku J. exp. Med., 1956, 64, 151-159. 

Murakami, U., & KAMEYAMA, Y. Effects of 
low-dose X-radiation on the mouse embryo. 
Amer. J. Dis. Children, 1958, 96, 272-277. 

Murray, J. E., & Harris, J. D. Negligible 
effects of X-radiation of the head upon 
hearing in the rat. J. aud. Res., 1961, 1, 
117-132. 

NATIONAL ACADEMY OF SCIENCES, Committee 
on the Biological Effects of Atomic Radia- 
tion. Biological effects of atomic radiation. 
Washington: NAS, 1960. 

Overall, J. E., Brown, W. L., & Gentry, 
G. V. Differential effects of ionizing radia- 
tion upon “absolute” and relational learn- 
ing in the rhesus monkey. J. genet. Psychol., 
1960, 97, 245-250. 

Overall, J. E., Brown, W. L., & Logie, L. C. 
The shuttlebox behaviour of albino rats 
during prolonged exposure to moderate-level 
radiation. Nature, 1960, 185, 665-666. 

Pare, R, & RIEDL, G. Elektrodermato- 
graphische Befunde nach Zwischenhirnbestrahlung. 
Strahlentherapie, 1956, 100, 408-421. 

Perese, D. M., Mureny, W. T., & PARSONS, 
J. A. Acute radiation changes in the 
Purkinje cells. Radiology, 1958, 70, 855- 
859. 

Piontkovskiy, I. A., & Kolomeitseva, I. A. 
[On certain characteristics of higher nervous 
activity in adult animals after prenatal 
exposure to ionizing irradiation: II. State of 
higher nervous activity in adult rats after 
roentgen ray-irradiation during the 18th 
day of pre-natal development.] Biull. eksp. 
Biol. Med., 1959, 48(12), 25-30. 

Piontkovskiy, I. A., & Kruglikov, R. I. 
[Effect of X-ray irradiation of pregnant 
females on the functional state of higher 
divisions of the central nervous system of 
their progeny.] Dokl. Akad. Nauk SSSR, 
1960, 130, 898-900. 

Piontkovskiy, I. A., Miklashevskiy, V. 
Ye., & Meyerson, F. Z. [Effect of gamma-
radiation of radioactive cobalt on conditioned 
and unconditioned reflexes.] In, 
[Works of the All Union Conference on 
Medical Radiology: Experimental Medical 
Radiology.] (Atomic Energy Commission 
Tech. Rep. No. 3661, 1959, pp. 69-78. 
Translation obtainable from the United 
States Department of Commerce, Office of 
Technical Services.) Moscow: Trudy, 1957. 

POMAZANSKAYA, L. F. The effect of total X- 
ray irradiation on the activity of acid and 
alkaline phosphatase of brain, liver, and 
spleen in rats. Dokl. Akad. Nauk SSSR, 
1960, 132, 1197-1200. 

Poplavskiy, N. K. [Changes in nervous 
reflexes in total body X-irradiation.] Med. 
Radiol., 1956, 1(5), 10-13. 

Pourquier, H., Baker, J. R., Giaux, G., & 
Benirschke, K. Localized roentgen-ray 
beam irradiation of the hypophysohypothalamic 
region of guinea pigs with a 2 
million volt Van de Graaff generator. Amer. 
J. Roentgenol., 1958, 80, 840-850. 

Pribram, K. H., & Kruger, L. Functions of 
the "olfactory brain." Ann. N. Y. Acad. 
Sci., 1953, 58, 109-138. 

Riggs, H. E., McGrath, J. J., & Schwarz, 
H. P. Malformation of the adult brain 
(albino rat) resulting from prenatal 
irradiation. J. Neuropathol. exp. Neurol., 1956, 15, 
432-447. 

Riggs, L. A., Cornsweet, J. C., & Lewis, 
W. G. Effects of light on electrical excitation 
of the human eye. Psychol. Monogr., 
1957, 71(5, Whole No. 434). 

Riopelle, A. J., Grodsky, M. A., & Ades, 
H. W. Learned performance of monkeys 
after single and repeated X irradiations. 
J. comp. physiol. Psychol., 1956, 49, 521-524. 


Romanoff, A. L. The avian embryo: Structural 
and functional development. New York: 
Macmillan, 1960. 

Rosenzweig, M. R., Krech, D., & Bennett, 
E. L. A search for relations between brain 
chemistry and behavior. Psychol. Bull., 
1957, 54, 476-492. 


Rose, W. Über elektroenzephalographische 
Veränderungen nach Röntgenbestrahlung 
des Gehirns. Fortschr. Röntgenstr., 1959, 91, 
789-798. 

Rugh, R. Vertebrate radiobiology (embryology). 
Annu. Rev. nucl. Sci., 1959, 9, 493-519. 

Rugh, R., & Grupp, E. Exencephalia following 
X-irradiation of the pre-implantation 
mammalian embryo. J. Neuropathol. exp. 
Neurol., 1959, 18, 468-481. (a) 

Rugh, R., & Grupp, E. X-irradiation 
exencephaly. Amer. J. Roentgenol., 1959, 81, 
1026-1052. (b) 

Rugh, R., & Grupp, E. Fractionated 
X-irradiation of the mammalian embryo and 
congenital abnormalities. Amer. J. Roentgenol., 
1960, 84, 125-144. 

RUssELL, Liane B. The effects of radiation on 
mammalian prenatal development. In 
A. Hollaender (Ed.), Radiation biology. Vol. 
1. High energy radiation. New York: 
McGraw-Hill, 1954. Pp. 861-918. 

SAMOYLOYA, L.G. [Concerning certain mecha- 
nisms of the influence of low doses of 
chronic total-body X-irradiation on the 
higher nervous activity and certain vegeta- 
tive functions of white rats.] Med. Radiol., 
1959, 4(8), 13-17. 

SAYENKO-LYUBARSKAYA, U. F. [On the effect 
of ionizing radiation on the human nervous 
system.] Fiziol. Zh. Akad. Nauk Ukr. 
R. S. R., 1959, 5, 261-269. 

Schwartzbaum, J. S., Hunt, E. L., Davies, 
B. P., & Kimeldorf, D. J. The effect of 
whole-body X irradiation on the electroconvulsive 
threshold in the rat. J. comp. 
physiol. Psychol., 1958, 51, 181-184. 

SEMAGIN, V. N. [The state of the higher 
nervous activity in rats subjected to daily 
X-ray irradiation at the stage of embryonic 
Sosippianae] Med. Radiol., 1959, 4(6), 16- 
1 


SHARP, J. C. Effects of fetal X irradiation on 
maze learning ability and motor coordina- 
tion in albino rats. J. comp. physiol. 
Psychol., 1961, 54, 127-129. 

SIEKERT, R. G., WiLLIAMS, S. C., & WINDLE, 
W. F. Histologic study of the brains of 
monkeys after experimental electroshock. 
Arch. Neurol. Psychiat., 1950, 63, 79-86. 

Sikov, M. R., & Noonan, T. R. Anomalous 
development induced in the embryonic rat 
by the maternal administration of radiophosphorus. 
Amer. J. Anat., 1959, 103, 
137-161. 

SKVIRSKAYA, K. B. [The influence of ionizing 
radiation on the nervous system in deleteri- 
ous occupational conditions.] Med. Radiol., 
1961, 6(1), 5-9. 




Smiawova, N. P. [The effect of total X-ray 
irradiation on the thalamus.] Biull. eksp. 
Biol. Med., 1959, 48(9), 38-41. 

Spear, F. G. Radiations and living cells. New 
York: Wiley, 1953. 

Stahl, W. R. A review of Soviet research on 
the central nervous system effects of ionizing 
radiations. J. nerv. ment. Dis., 1959, 
129, 511-529. 

Stahl, W. R. Recent Soviet work on reactions 
of the central nervous system to ionizing 
radiations. J. nerv. ment. Dis., 1960, 131, 
213-233. 

Tacker, R. S., & Furchtgott, E. Low-level 
gamma-irradiation and audiogenic seizures. 
Rad. Res., 1962, 17, 614-618. 

Tacker, R. S., & Furchtgott, E. Adjustment 
to food deprivation cycles as a function 
of age and prenatal X irradiation. 
J. genet. Psychol., in press. 

TactENA, M. [X-ray phosphenes] Ces. 
Ofthalmol., 1957, 13, 86-89. 

TaNIMURA, H. Changes of the neurosecretory 
granules in the hypothalamo-hypophysial 
system of rats by irradiating their heads 
with X-rays. Acta anat. Nippon., 1957, 32, 
529-533. 

Teisster, J., & VvskoCm, J. [Visibility ot 
Xrays and use of this phenomenon in 
ophthalmological diagnosis.] Ces. Ofthal- 
mol., 1957, 13, 81-85. 

Tsuki, S., & IkEGAMI, A. Personality tests 
on the atomic bomb exposed children. In, 
Research in the effects and influences of the 
nuclear bomb test explosions. Vol. 2. Tokyo: 
Japanese Society for the Promotion of Sci- 
ence, 1956. Pp. 1709-1714. 

Tsuki, S. et al. Psychiatric investigations of 
persons with a history of exposure to the 
A-bomb. Nagasaki Igak. Z., 1958, 33, 637- 
639. 

Unitep Nations. Report of the United Na- 
tions Scientific Committee on the Effects of 
Atomic Radiations. New York: UN, 1958. 

UPTON, A. C. Ionizing radiation and the aging 
process. J. Gerontol., 1957, 12, 303-313. 

Urner, H. L., & Brown, W. L. The effect of 
gamma radiation on the reorganization of a 
complex maze habit. J. genet. Psychol., 
1960, 97, 67-76. 




Yawazaxi; J. N., Bemærr, L. Ry McFALL, 
C. D. Brain 
actirity. of mbblta efr tie ae 


of 
Commerce, Office of Technical Services.) 
Moscow: Trudy, 1957. 

Yarutuy, Ks. [Changes of the higher 
nervous activity in experimental chronic 
radiation sickness induced by ionizing radiation.] 
Med. Radiol., 1959, 4(12), 16-21. 

Zeman, W., Curtis, H. J., GEBHARD, E. L., & 
Haymaker, W. Tolerance of mouse-brain 
tissue to high-energy deuterons. Science, 
1959, 130, 1760-1761. 

Znuvk, E. G. [Observations on the higher 
nervous activity in workers exposed to 
ionizing radiations.] Voen.-med. Zh., 1957, 11, 
20-23. 


(Received August 7, 1961) 


Psychological Bulletin 
1963, Vol. 60, No. 2, 200-209 


PHONETIC SYMBOLISM RE-EXAMINED¹ 
Johns Hopkins University 


A number of previous observations and experiments in phonetic sym- 
bolism have been reviewed, and their methods of investigation and their 
rationale for the existence or nonexistence of phonetic symbolism 
summarized. A series of 3 experiments by Taylor and Taylor was briefly 
presented. The present paper draws readers’ attention to their experi- 
mental result that the same sound is associated with different meanings 
in different languages. The existence of phonetic symbolism on non- 
spatial dimensions is pointed out. A new rationale of phonetic symbol- 
ism has been suggested to account for these results, along with 2 ways to 
test this rationale. Finally, a few applications of these findings in pho- 
netic symbolism have been proposed. 


Use of the term phonetic sym- 
bolism implies a belief that there is an 
intrinsic correspondence between 
sounds and meanings. Certain sounds 
are said to suggest certain meanings 
apart from their conventional mean- 
ings because of some quality inherent 
in the sounds, or because of the ways 
in which the sounds are produced. 
To English speaking people, the 
vowel I in BIT seems to be appropriate 
to describe something small, 
the vowel A in VAST, something large. 
Thus, the nonsense syllable MAL 
sounds bigger than MIL. Further, the 
usual hypothesis about phonetic symbolism 
is that it is universal in 
scope, i.e., the same sound is associated 
with the same meaning even in 
historically unrelated languages. 


TRADITIONAL RATIONALE 


There is a rationale for universal 
phonetic symbolism. The sounds we 
use in language have physical properties 
(power, timbre, pitch, duration, 
etc.), and objects in the world 
also have physical properties. Rubenstein 
and Aborn (1960) pointed out 
that there is a correlation between 
pitch of sounds and size of objects. 
Small objects emit high sounds, and 
large objects emit low sounds. According 
to Paget (1930), the Proto-Polynesian 
words for large and small, 
OHO and I-I, are appropriate because 
the mouth forms large and small 
apertures in the two cases. Then 
again, as Jespersen (1922) observed, 
we also have a tendency to lengthen 
and strengthen single sounds under 
the influence of strong feeling or in 
order to intensify the effect of the 
spoken word; thus, in "extremely 
LONG" the vowel O may be lengthened. 


1 The study reported here is a part of a 
PhD dissertation submitted to the Johns 
Hopkins University. 

The author wishes to express her gratitude 

The problem is to determine 
whether or not such relationships be- 
tween sound and meaning exist. If 
they do, are the same relationships 
universal, or do they exist only in par- 
ticular languages? And, to what ex- 
tent is there a relation between sound 
and meaning? Is it a function of 
dimensions of meaning? 


OBSERVATIONS SUPPORTING 
PHONETIC SYMBOLISM 


Observations by Jespersen and 
other linguists support the existence 
of phonetic symbolism. Socrates, in 
Plato's (1892) "Cratylus," ventures 
that "When we want to express ourselves, 
either with the voice, or 
tongue, or mouth, the expression is 
simply the imitation of that which we 
want to express" (p. 253). Jespersen 
(1922) observed that 

"dull" or "dump" (back) vowels are used in 
English for symbolic expression for dislike, 
disgust, or scorn: blunder, bungle, bung, 
clumsy, humdrum, humbug, strum, slum, slush, 
slubber, sloven, muck, mud, muddle, mug, 
scug, juggins, numbskull, dunderhead, gull. 

Such observations may only be considered 
suggestive, since in giving 
these examples of close relation between 
sound and meaning the observers 
might have considered only 
such words as fit their theory. 


EXPERIMENTAL TESTS 
Word Matching 


There have been two types of ex- 
periments on phonetic symbolism. 
The word-matching technique, first 
used by Tsuru and Fries (1933), is 
still the favorite method of investi- 
gating phonetic symbolism: usually, 
lists of antonymic pairs in two differ- 
ent languages are used for test 
material; subjects who know only one 
of the languages (or even neither) are 
asked to match corresponding mem- 
bers of the pairs in the two languages. 
Positive results here mean better 
than chance matching of certain 
words in one language with words of 
the same meaning in another lan- 
guage. The matching experiments 
have led to conclusions both for and 
against the hypothesis of universal 
phonetic symbolism. 
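
What "better than chance" means in such a task can be made concrete with a small sketch. It assumes that each judgment is a two-alternative choice, so that a guessing subject is right half the time, and that judgments are independent; the function name and the counts are illustrative assumptions, not figures taken from any study reviewed here.

```python
from math import comb


def chance_tail_probability(correct: int, trials: int, p_chance: float = 0.5) -> float:
    """Probability of at least `correct` right matches in `trials` guesses.

    Assumes independent judgments, each correct with probability `p_chance`
    (0.5 for a two-alternative matching task).
    """
    return sum(
        comb(trials, k) * p_chance**k * (1 - p_chance) ** (trials - k)
        for k in range(correct, trials + 1)
    )


if __name__ == "__main__":
    # Hypothetical pooled data: 120 correct matches out of 200 judgments (60%).
    p_value = chance_tail_probability(120, 200)
    print(f"P(at least 120 of 200 correct by guessing) = {p_value:.4f}")
```

Judgments pooled over the same subjects and the same word pairs are not strictly independent, so a tail probability of this kind is best read as a rough guide rather than as the test any particular investigator used.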

Table 1 summarizes the results 
from a number of word-matching ex- 
periments. From Table 1 it is hard to 
predict when positive or negative 
results will be obtained. The five 
language pairs (Table 1, Footnote b) 
which come from a common language 


family are not necessarily matched 
better than the language pairs with 
no historical relation. One general 
trend in Table 1 is that the more recently 
the experiments were done, the 
weaker the positive results. 
Perhaps more recent experiments 
have tended to eliminate sources of 
spurious positive results. 

The following points on the word- 
matching experiments may be con- 
sidered: 

First, if the experimenters know 
both languages, they may select test 
words that resemble each other from 
the two languages. Since only 27 
phonemes on the average (Hockett, 
1958) are used to make words in a 
language, the chance can not be ex- 
cluded that some words of similar 
meaning in different languages may 
resemble each other. This fact may 
explain the unusually high matching 
success found by Tsuru and Fries 
(1933) and by Maltzman, Morri- 
sett, and Brooks (1956) for English- 
Japanese (Maltzman used the same 
word list as did Tsuru). Even if the 
experimenters know only English, still 
they have to choose certain English 
words as test words. Different cri- 
teria for choosing the test words may 
give different degrees of matching 
success. Brackbill and Little (1957) 
used only familiar test words and ob- 
tained three positive and three nega- 
tive results. Brown, Black, and 
Horowitz (1955) and Brown and 
Nuttall (1959) used familiar and 
"sensory" words (e.g., WARM-COOL; 
HEAVY-LIGHT) from English, and ob- 
tained significantly better than 
chance matching in their language 
pairs. In both experiments they took 
additional trouble to make the two 
words in each pair of nearly equal 
length. Brown and Nuttall concluded 
that the "phonetic symbolism effect 
may be limited to pairs naming continua 
concerned with magnitude and 
its concomitants." 


TABLE 1 
SUMMARY OF WORD-MATCHING EXPERIMENTS 

Year  Investigator                     Language pair         Correct matches (%)  Conclusion 
1933  Tsuru and Fries                  English-Japanese      75 (a)               Yes, phonetic symbolism exists 
1935  Allport                          English-Hungarian     56.6                 Yes 
1953  Rich                             English-Japanese      57.2 [?]             Yes 
                                       English-Polish (b)    64.8** 
1955  Brown, Black, and Horowitz       English-Chinese       58.9***              Yes 
                                       English-Czech (b)     53.7 [?] 
                                       English-Hindi (b)     59.6*** 
1956  Maltzman, Morrisett, and Brooks  English-Japanese      CR 4.21***           No 
                                       English-Croatian (b)  CR 4.25*** 
                                       Japanese-Croatian     CR .46 
                                       Croatian-Japanese     CR .80 
1957  Brackbill and Little             English-Japanese      50.3                 No 
                                       English-Hebrew        53.0** 
                                       English-Chinese       49.9 
                                       Chinese-Japanese      54.8** 
                                       Chinese-Hebrew        48.1* 
                                       Hebrew-Japanese       52.3** 
1959  Brown and Nuttall                English-Chinese       60.2***              Yes 
                                       English-Hindi (b)     60.5*** 
                                       Chinese-Hindi         34.2* 
                                       English-Chinese       51.5 [?] 
                                       English-Hindi (b)     51.7** 

a No significance levels were indicated in the original publications. 
b Language pairs which come from a common language family. 
c (.01) below 50% correct match. 
* p < .05 
** p < .01 
*** p < .001 

Secondly, there is a likelihood of 
selecting a foreign equivalent to 
English by some sort of association 
mechanism. There seldom exists one 
and only one foreign equivalent to an 
English test word. In the process of 
translating English test words, a 
translator (even if inadvertently, being 
unaware of the purpose of the experiment) 
could choose the foreign equivalent 
that more closely resembles or 
clang-associates with the given English 
word. For example, in translating 
OLD into a Japanese equivalent, a 
translator may choose OITA instead of 
TOSHITOTTA. OITA comes to the 
translator's mind earlier than TOSHITOTTA 
because O in OLD makes him 
clang-associate to a Japanese word 
that also has a beginning O. Tsuru's 
list, for example, used OITA for OLD. 
Other similar choices may occur 
wherever direct translation is used to 
generate the words to be matched. 

Thirdly, these experiments do not 
answer the fundamental question of 
phonetic symbolism, namely, what 
sound has what meaning. Suppose 
the Japanese word MIDORI and the 
English word GREEN are matched 
correctly in pairs with Japanese AKA 
and English RED. There are a few 
features that are common in the 
words of these pairs: MIDORI and 
GREEN share two sounds, R and I, and 
AKA and RED have the same length. 
Now, the question is which, if any, of 
these common features is identified 
with "greenness" or "redness"? 


Analytical 

Sapir (1929) and Newman (1933) 
looked for phonetic symbolism using 
an analytical approach: the second 
type of experimental test. English 
speaking subjects judged nonsense 
syllables for their size or brightness 
in a paired-comparison task. These 
nonsense syllables differed only in a 
single vowel. Thus, Sapir's subjects 
compared for size MAL with MEL; or 
MEL with MIL, and so on. 

In their experiments, both Sapir 
and Newman pronounced their test 
words. Such a technique may intro- 
duce bias. According to Eberhardt 
(1940), who used deaf subjects, if the 
deaf were unaware of the meanings 
of TAP and POUND, they spoke the 
two words with little difference—but, 
once they learned the meanings, 
TAP became lighter and softer, while 
POUND took on a richer resonance. 

In these experiments, there is no 
evidence that the same ordering of 
the middle vowels will apply if the 
environmental consonants, for example 
M-L, are exchanged for other 
consonants such as V-G. VIG might 
whereas MIL was found to seem 
smaller than MAL. 

Another criticism of these experi- 


ments is that the experimenters used 
only English speaking subjects, which 
means that the results they obtained 
apply only to English. In other 
words, universal phonetic symbolism 
can not be established as long as sub- 
jects of only one language are used. 
A small number of Sapir's subjects 
were Chinese residing in the United 
States, and even these may have been 
influenced by contact with English. 
Dagiri (1958), using Japanese sub- 
jects, modeled his experiment on 
Sapir's and obtained results similar 
to those of Sapir. However, Dagiri 
had only 10 pairs of nonsense syl- 
lables for his test material. 

In all of the above analytical experiments, 
the investigators used 
only one or two dimensions of meaning: 
Sapir and Dagiri used size 
only; Newman, size and brightness. 
On the traditional rationale these 
dimensions of meaning might be ex- 
pected to be closely related to the 
tonal properties of sounds. While 
Newman used two dimensions, he 
had the same subjects judge the same 
sounds on two dimensions. This 
technique, however, is not to be rec- 
ommended, for subjects may cor- 
relate their judgments of the sounds 
between one dimension and the 
other. 

Bentley and Varon (1933) concluded 
that "there seems to be insufficient 
evidence that . . . sounds 
carry in their own right a symbolic 
reference," after a series of analytical 
experiments on phonetic symbolism. 

In their first experiment, Bentley 
and Varon presented to three English 
speaking subjects 10 CVC (consonant-vowel-consonant) 
nonsense syllables. 
One of the experimenters pronounced 
the syllables behind a screen. The 
subjects were instructed to give a 
synonym or otherwise to express the 
meaning of the heard sound. The 
finding from this experiment was that 
"the test sounds did not spontaneously 
lead to other sounds which denote 
or hint at spatial magnitude or 
its qualification." 

In their second experiment (Bent- 
ley & Varon, 1933), 10 “categories” 
(angular, large, etc.) were prescribed 
to the subjects. The same three sub- 
jects from the first experiment were 
to report whether a test sound im- 
plied some degree of the first or of its 
opposite (e.g., large or small). Their 
result was that "a little less than one-third 
of the whole number of Ss' reports 
indicate that to the nonsense 
syllables used the quality expressed 
in the category-words was applied." 

In their third experiment, five cate- 
gories and their opposites were used 
with nonsense syllables in pairs. The 
subjects were to report which (if 
either) of the two nonsense syllables 
more closely approximated the mean- 
ing of the category word just given. 
In other words, the subjects were left 
free to discover that either or neither 
of the nonsense syllables applied to 
size, strength, and the other category 
words. In the 300 pairings, some de- 
gree of relationship was reported for 
165 items and for all five categories. 

The results of Experiments I, II, 
and III can be explained on the as- 
sumption that some sounds may be 
associated very well with some di- 
mensions of meaning (or category) 
but not at all, or slightly, with other 
dimensions. Perhaps the limited 
number of sounds and categories used 
in these experiments were some of 
the more "unsuitable" instances. 

In their first experiment, the investigators 
included A and I sounds, 
which were reported by Sapir and 
Newman to have distinct size contrasts. 
However, there is no indication 
of any systematic control over 
the combination of sounds used in 
manufacturing nonsense syllables in 
those experiments. Thus, there is no 
guarantee that the subjects were indeed 
responding only to the sounds to 
which the experimenters expected 
them to respond. For example, the 
subjects might have been responding 
to the environmental consonants 
when the experimenters wanted them 
to respond to A and I sounds only. 

In all of the above three experi- 
ments, the same three (only three) 
subjects were used, which reduces the 
generality of the results. 

In their fourth experiment, Bentley 
and Varon used 26 subjects and 36 
nonsense syllables in pairs (e.g., FAM, 
JAF, and MAF were, respectively, 
paired with FIM, JIF, and MIF). Consonant 
variations were used with 
each vowel. Three categories (angular, 
large, and hard) and their opposites 
(round, small, and soft) were 
used. Each time (a) a pair of nonsense 
syllables (FAM-FIM); (b) object, 
action, or quality word (CABIN); 
and (c) category word (which is 
the larger cabin?) were uttered in 
order. The A sounds were larger than 
I sounds in the approximate ratio of 
4:1. For angularity and roundness, 
the ratio was 3:1, A sounding round 
and I angular. For hardness, A was 
softer than I in a ratio of 2:1. Thus, 
"positive results" (results that are 
comparable to the results of Sapir 
and Newman) were obtained in Experiment 
IV where "degree in some 
scale" was suggested and prescribed. 

The authors state that the above 
finding does not prove that the "magnitude 
dimension came out of the 
nonsense syllables in the form of a 
phonetic symbolism." They rather 
think that "it only shows that, under 
conditions, the Ss apply, with more 
or less consistency, 'some sort of difference' 
in the two nonsense syllables 
to the problem proposed by the experimenter." 

The authors do not clarify in the 
above statement what this “some 
sort of difference” is, and how it came 
to exist. In other words, they refuse 
to call the subjects' tendency to associate 
A with largeness, softness, and 
roundedness, and I with the opposites 
of these dimensions, phonetic symbolism, 
but their reasons for such a 
refusal are not clear. Nor are there 
any satisfactory alternate explana- 
tions given by the authors for their 
findings of Experiment IV. 

Finally, the same criticisms leveled 
against the Sapir and Newman experiments 
(criticisms of the method of 
presentation, the kind of subjects, and 
the subject's task) are applicable to 
the Bentley and Varon studies. 

Taylor and Taylor (1962) used an 
analytical technique to investigate 
phonetic symbolism in four dimen- 
sions (size, movement, pleasantness, 
and warmth). They used subjects 
who were monolinguals in each of 
four historically unrelated languages 
(English from the Indo-European 
language family, Japanese from the 
Japanese language family, Korean 
from the Korean language family, 
and Tamil from the Dravidian lan- 
guage family). Their CVC test syl- 
lables used all the consonants that 
are common to all four languages, as 
well as six vowels which are similar 
in each language. 

They performed a series of two pre- 
liminary experiments with similar 
general procedures. In these two ex- 
periments English speaking mono- 
linguals judged typed nonsense syl- 
lables for size on a five-point rating 
scale. Interactions among sounds, 
visual length of syllables, the number 
of syllables in test words, and stress 
in two-syllable words were investi- 
gated. To investigate interactions, 


they used all the CVC nonsense 
syllables which could be made by 
permuting six consonants and three 
vowels. An analysis of variance 
showed that there was no interaction 
among the letters of a syllable, but 
that each contributed independently 
of the others. In the third experiment 
they took into consideration the rele- 
vant points indicated by the series of 
two experiments. On the whole, the 
results of the two experiments showed 
that phonetic symbolism existed in 
English, and that the technique was 
suitable for its investigation. Par- 
ticularly, the lack of interactions 
among CVC sounds permitted Latin 
square type designs, and analysis 
of the data directly in terms of 
initial consonant, middle vowel, and 
final consonant. In the third, and 
final experiment, Taylor and Taylor 
used monolingual subjects speaking 
English, Japanese, Korean, and 
Tamil. The six vowels and 12 con- 
sonants that are legitimate phonemes 
in all these four languages were made 
into 144 CVC nonsense syllables in a 
pseudo-Latin square design. These 
144 nonsense syllables were written 
in the four different languages and 
administered in the countries where 
the languages are spoken. The test 
syllables were judged on four dimen- 
sions of meaning: size (big-small), 
movement (active-passive), pleasantness 
(pleasant-unpleasant), and 
warmth (warm-cool). 
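
How 12 consonants and six vowels can yield exactly 144 balanced CVC syllables may be sketched as follows. This is only one possible construction, offered as an assumption: the source describes the design simply as "pseudo-Latin square" and lists neither the phonemes nor the assignment rule, so the labels and the cyclic rule below are hypothetical.

```python
from collections import Counter

# Placeholder phoneme labels (assumed for illustration; the source does not
# list the 12 consonants and 6 vowels actually used).
CONSONANTS = list("BDGHJKMNPRST")
VOWELS = ["A", "E", "I", "O", "U", "EU"]


def build_syllables(consonants, vowels):
    """Return one (initial, vowel, final) triple per initial-final pair.

    The vowel is chosen cyclically, so every vowel occurs equally often
    with each initial consonant and with each final consonant, provided
    the number of consonants is a multiple of the number of vowels.
    """
    triples = []
    for i, initial in enumerate(consonants):
        for j, final in enumerate(consonants):
            vowel = vowels[(i + j) % len(vowels)]
            triples.append((initial, vowel, final))
    return triples


if __name__ == "__main__":
    syllables = build_syllables(CONSONANTS, VOWELS)
    print(len(syllables), "syllables, e.g.", ["".join(s) for s in syllables[:5]])
    print(Counter(vowel for _, vowel, _ in syllables))
```

Under this rule each initial-final consonant pair occurs once and each vowel occurs twice with every initial and every final consonant, which is the kind of balance that allows the data to be analyzed separately by initial consonant, middle vowel, and final consonant.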

The results of these three experi- 
ments indicated that people associate 
certain sounds with certain meanings, 
but the same sound is associated with 
different meanings in different languages. 
For example, English speaking 
subjects show a high degree of 
agreement in ranking the initial consonants 
on the size dimension, with 
G and K as the big sounds and T and 
N as the small sounds. However, 
Korean speaking subjects, though 
they too show a high degree of agree- 
ment among themselves in ranking 
the same initial consonants for size, 
consider T and P to be the biggest and 
J and M to be the smallest. The same 
sort of result holds with the vowel 
and the final consonants in CVC syl- 
lables. In general there was no cor- 
relation among the rankings from 
different languages. Nor was any 
correlation found among the rankings 
of sounds for the different dimensions. 
This lack of correlation of rankings on 
different dimensions means that 
Brown and Nuttall's (1959) conjec- 
ture that phonetic symbolism may 
exist only on magnitude-like con- 
tinua does not hold. Further, it 
demonstrates that subjects are all 
truly responding to some quality in 
sound which is connected with the 
test dimension. 
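
The comparison of rankings across languages can be illustrated with a rank correlation. The sketch below uses Spearman's coefficient for untied ranks; whether Taylor and Taylor used this particular statistic is not stated in the text, and the two rankings shown are invented for illustration only.

```python
def spearman_rho(rank_a, rank_b):
    """Spearman rank correlation for two untied rankings of the same items."""
    n = len(rank_a)
    if n != len(rank_b) or n < 2:
        raise ValueError("need two rankings of the same length (at least 2)")
    d_squared = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1 - 6 * d_squared / (n * (n * n - 1))


if __name__ == "__main__":
    # Hypothetical size ranks (1 = biggest) given to the same six initial
    # consonants by speakers of two different languages.
    consonants = ["G", "K", "P", "J", "T", "N"]
    language_x = [1, 2, 3, 4, 5, 6]
    language_y = [4, 6, 2, 3, 1, 5]
    print("rho =", round(spearman_rho(language_x, language_y), 3))
```

A coefficient near +1 would indicate that the two languages order the sounds alike, a coefficient near 0 that the orderings are unrelated, which is the pattern reported above.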


A NEW THEORY OF PHONETIC 
SYMBOLISM 


The results of the analytical experi- 
ments indicate that people do asso- 
ciate a certain sound with certain 
meanings. However, recent cross 
language experiments indicate that 
the same sound is differently asso- 
ciated in different languages. 

What is the basis of such an asso- 
ciation between sound and meaning? 
The rationale postulated by previous 
investigators and mentioned in the 
introduction of the present paper is 
not adequate in view of the more 
recent findings. If we look only at 
the English results (upon which the 
old rationale was based) in the data 
of Taylor and Taylor, the traditional 
rationale works. In vowels, I and E 
with their high pitch and small oral 
cavity are given the smallest size 

scores, while O and U with their large 
oral cavity and low vocalic resonance, 
the biggest size scores. 

However, it is difficult to explain 
with this rationale the finding that I 
got the second largest, and U the 
second smallest size scores in Tamil, 
unless small objects emit low sounds 
in South India and Ceylon. Then, 
there are the results of Eberhardt's 
(1940) experiments with deaf chil- 
dren to be accounted for. Eber- 
hardt, using the Sapir-Newman pro- 
cedure with English reading deaf 
children obtained similar orderings of 
English vowels as for hearing (Eng- 
lish speaking) subjects. The deaf 
subjects obviously could not have 
based their judgment on the tonal 
quality of the sounds. Thus, it is 
clear that the old interpretation of 
phonetic symbolism, based mainly on 
results from English subjects, has to 
be revised in the light of the available 
evidence. 

First of all, the concept of phonetic 
symbolism has to be modified. An ex- 
planation for phonetic symbolism 
must be sought somewhere other 
than in the relation between linguis- 
tic or physical properties of sounds 
and objects in the world. A new 
hypothesis must be found that ac- 
counts not only for the fact that peo- 
ple associate certain sounds with cer- 
tain meanings, but also the fact that 
people speaking different languages 
associate the same sounds with different 
meanings. 

If different phonetic symbolism 
patterns exist in different languages, 
some factors that are different from 
one language to another unrelated 
language must be the chief variable 
for phonetic symbolism. Among such 
factors, different language habits can 
be considered. For example, people 
speaking English may develop the 
habit of associating G— with bigness, 
because in English G— is often used 
for words meaning very big, such as 
GRAND, GREAT, GROW, GAIN, GARGANTUAN, 
GROSS, to name some. On 
the other hand, T— and N— are asso- 
ciated with smallness, probably be- 
cause in English some frequently 
used “small words” start with T—: 
TINY, TIT, TOM, TRIFLE, TINGE, TOM- 
TIT, etc. Now, in Korean, the same 
T— is associated with bigness, and 
T— occurs in the Korean word TAE 
meaning great or very big. J, which 
got the smallest size score in the Korean 
data, on the other hand, begins the 
Korean word JAGGEUN which means 
small. In Japanese, such big words 
as DEBU (fat), DEKKAI (huge), or 
DAIKIBO (on grand scale) start with 
D, which obtained the largest size 
score in Japanese. 

Newman (1933) found in his ex- 
periment that phonetic symbolism in 
his subjects became more consistent 
with increased age (up to the age 
of 11). This finding of Newman is 
expected by the above hypothesis of 
language habits as the chief variable 
for phonetic symbolism: as language 
habits get solidified with increased 
age, one of the products of these 
habits, phonetic symbolism, has to 
become consistent. 

Once a certain sound has thus be- 
come associated with a certain mean- 
ing, then within that language a 
cluster of words of similar meaning 
may come to employ similar sounds. 
Some such examples from English 
are: words meaning rapid movement 
and having FLI—: FLICK, FLIP, FLIT, 
FLITTER, FLICKER, FLING. Some exam- 
ples of words meaning gentle (in 
slope), or slow in Japanese and 
having similar sounds are: YURU- 
YURU, YUKKURI, YUTTARI, YURURI, 
YURUYAKA, YURUI, etc. 

One possible way to test the above 
interpretation is to ask a number of 
people to give monosyllabic small 
words and monosyllabic big words, 
and compare initial consonants' dis- 
tribution over many such responses. 
For example, we would be interested 
in finding out whether or not English 
speaking subjects give more big words 
starting with G and K, the two big 
“size” initial consonants, than “big” 
words starting with T or N, the two 
small "size" consonants. We could 
look at the small words, too, and see 
if more small words start with T and 
N, than with G or K. 

Another way to test this explana- 
tion is to count the number of size 
words in a language with each ini- 
tial consonant. Newman (1933) pre- 
pared such a list of English words in 
his attempt to determine whether 
English words generally reflected his 
results. Newman gathered lists of 
words from Roget’s Thesaurus under 
the rubrics of greatness, smallness, 
size, and littleness. After striking out 
repetitions, derivatives, and phrases, 
the list (by then consisting of about 
500 words) was given to 11 judges 
who checked the words that did not 
conform in meaning to the “denota- 
tive large-small words,” i.e., the 
words that were obviously not size 
words. The final list was taken from 
a majority vote of the judges. New- 
man counted in the size words of the 
final list the frequency of the sounds 
with different size scores which he 
obtained in his experiments. New- 
man concluded from this process that 
the actual distribution of sounds in 
the two size categories is fairly 
random. 

Newman's size word list may be 
re-examined in the light of the results 
found by Taylor and Taylor (1962). 
The words found by Newman to be 
large or small, and with an initial 
consonant of T, N, G, or K are shown 
in Table 2. It can be seen that there 
is a tendency for Newman's words be- 
ginning with T or N (the two smallest 
in the Taylor and Taylor study) to 

TABLE 2 
ENGLISH SIZE WORDS FROM THE NEWMAN LIST WITH INITIAL CONSONANTS OF G, K, T, AND N 

Initial consonants with different size scores from Taylors' study 

         Big size scores                     Small size scores 
Size     G (as in GATE)   K (as in KING)     T             N 

Large    GARGANTUA        CAPACITY           TERRIBLE      NOBLE 
         GLARING          CARGO              TOWER         NOTABLE 
         GOODLY           COLOSSUS           TRANSCEND     NOTEWORTHY 
         GRAVE            COMPREHENSIVE      TREMENDOUS    NUMBER 
         GREAT            CONSIDERABLE       TRUTH 
         GROSS            CONSUMMATE 
                          CORPORATION 
                          CORPULENCE 
                          CRASS 
                          QUANTITY 

Small    GRAIN            CONTRACT           TAG           NARROW 
         GRUB             CORPUSCLE          TATTER        NEAR 
                          CRUMB              TENUOUS       NUTSHELL 
                          CRUMBLE            TENDER        GNAT 
                                             TINCTURE 
                                             TINGE 
                                             TINY 
                                             TIT 
                                             TITTLE 
                                             TOMTIT 
                                             TOUCH 
                                             TRIFLE 

be small and for those beginning with 
G or K (the two largest) to be large. 
This tendency is significant with p 
smaller than .05 according to a chi 
square test. 
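
The reported significance can be checked roughly against Table 2. The sketch below is a minimal illustration, assuming that the test was an ordinary 2 x 2 chi-square (large versus small words by G or K versus T or N initials) without continuity correction, and that the cell counts are simply the words tallied from Table 2 as printed here. Neither assumption is stated in the text, and the function name is hypothetical.

```python
# Cell counts tallied from Table 2 (an assumption: words counted as printed).
# Rows: large words, small words. Columns: G or K initial, T or N initial.
observed = [[16, 9],
            [6, 16]]


def chi_square_2x2(table):
    """Pearson chi-square for a 2 x 2 table, without continuity correction."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count under independence of size and initial consonant.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected
    return chi2


CRITICAL_05_DF1 = 3.841  # tabled chi-square value for p = .05 with 1 df

chi2 = chi_square_2x2(observed)
print(f"chi-square = {chi2:.2f}, exceeds .05 criterion: {chi2 > CRITICAL_05_DF1}")
```

With these counts the statistic comes to about 6.34, which exceeds the tabled value of 3.84 for one degree of freedom and so agrees with the reported p smaller than .05.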


SOME IMPLICATIONS OF RECENT 
FINDINGS IN PHONETIC 
SYMBOLISM 


The experimental method of Tay- 
lor and Taylor could be used as one of 
the objective and quantitative meth- 
ods of determining the degree of 
relatedness between any pair of 
languages. If the hypothesis pro- 
posed above is true, we can interpret 
a significant correlation between two 
phonetic symbolism patterns of any 
pair of languages as reflecting the 
existence of similarity between the 
two language habits concerned. In 


other words, the two languages may 
share similar sounding words for the 
same meaning, like German and 
English which share G— for big 
words (GROSS, GROSSARTIG, GEWIN- 
NEN, GOTT, in German). For the 
middle vowel, Jespersen’s (1922) 
numerous examples from various 
Indo-European languages can be 
cited: French, PETIT; Italian, PIC- 
COLO, PICCINO; German, KIND; Dan- 
ish, PILT; Spanish, CHICO; Latin, 
QUISQUILIAE, MICA; English, TIP, PIN, 
CHINK, SLIT, KID, CHIT, LITTLE, 
MIDGE, BIT, CHIP, WHIT. In short, 
related languages should show similar 
patterns of phonetic symbolism as 
compared with the phonetic symbo- 
lism patterns of languages from other 
language families. The validity of 
this suggestion may be tested by 
studying a number of languages 
whose relations are already well 
known. The Indo-European lan- 
guage family is such an example. 

If we obtain phonetic symbolism 
patterns in English, German, Rus- 
sian, and one non-Indo-European 
language, there should be a hierarchy 
in the degree of relatedness among 
those languages as reflected in pho- 
netic symbolism. The highest degree 
of relatedness should be between, for 
example, English and German, two 
languages that come from the same 
language branch within the same lan- 
guage family. The next highest de- 
gree of relatedness should exist be- 
tween English (or German) and Rus- 
sian, which share the same language 
family but not the same branch. No 
relation ought to be found between 
English (or German or Russian) 
and a non-Indo-European language, 
which are historically unrelated. 
Note that the proposed technique 
only finds the existence and a certain 
pattern of relationships between a 
language pair (or among languages), 
but does not explain how such a rela- 
tion has come about. 

The extent and pattern of phonetic 
symbolism in different languages can 
be utilized in advertisements or in 
poetry writing in each language. For 
instance, a name for a new cool bever- 
age product can be coined by com- 
bining the C—, —V—, and —C with 
the respective smallest warmth score. 
The result will be something like 
KOR for English. Now, all this word 
coining effort will be pointless if an 
experiment does not show that the 
word coined on this principle appeals 
more to consumers, hence is remem- 
bered better, causing more people to 
buy the product than the word 
coined without a principle. 

In English, and to a far lesser de- 
gree in Japanese, nonsense syllables 
have been extensively explored and 
used in many verbal-behavior or 


learning experiments. The study of 
Taylor and Taylor demonstrates that 
nonsense syllables can be meaning- 
fully used by naive subjects in vari- 
ous cultures with different languages. 


DELAYED AUDITORY FEEDBACK 


AUBREY J. YATES¹ 
University of Western Australia 


When S hears his own voice with a small time delay his speech may be 
seriously affected. The effects produced by delayed auditory feedback 
(DAF) include prolongation of vowels, repetition of consonants, in- 
creased intensity of utterance, and other articulatory changes. The sig- 
nificance of individual differences in susceptibility to DAF is considered 
in relation to personality and physiological characteristics. The tech- 
nique may prove useful in the detection of auditory malingering and 
has possible implications for the understanding of stammering. 


discussion relates the findings to models of speech control. Methodo- 
logical problems and future research needs are outlined. 


It has long been debated whether 
the successful regulation of skilled 
response patterns is dependent upon 
the continuous monitoring of the on- 
going processes by means of feedback 
mechanisms. In the case of speech, it 
would seem to be necessary for the 
subject (S) to be repeatedly informed 
of the extent to which the skilled re- 
sponse pattern is proceeding smoothly 
so that appropriate corrections can be 
inserted into the sequence, where 
necessary. The appropriate informa- 
tion in the case of speech is derived 
from at least three sources: kinesthetic 
and proprioceptive feedback from 
changes in the muscular and sensory 
apparatuses involved in speaking and 
listening; auditory feedback trans- 
mitted via the bony structures of the 
organism, particularly the bones of 
the head; and auditory feedback 


¹ For assistance in obtaining certain refer- 
ences the author thanks C. J. Atkinson, J. W. 
Black, G. F. Bond, R. A. Chase, D. G. 
Doehring, D. G. Ellson, E. W. Gibbons, G. J. 
Harbold, R. W. Peters, B. M. Siegenthaler, 
S. Sutton, and G. C. Tolhurst. R. A. Chase 
materially assisted the preparation of this re- 
view in many ways, and with J. Ross and A. J. 
Marshall offered many valuable criticisms. 


transmitted through the air to the 
speaker's own auditory reception 
apparatus. In normal speech these 
three sources of information supple- 
ment each other and are presumably 
integrated at higher neural levels in 
the cortex. 

It has also long been known 
(Cherry & Sayers, 1956) that inter- 
ference with the natural relationships 
between ongoing speech and the con- 
sequent feedback of information 
could lead to severe disturbances in 
the smooth progress of speech, but it 
was not until the observations of Lee 
(1950a, 1950b, 1951) were published 
in America that interest in the de- 
tailed examination of the phenomenon 
quickened. Essentially, a situation is 
arranged such that S hears his own 
voice through headphones with a de- 
lay of about one-fifth of a second, 
usually while reading aloud a con- 
tinuous prose passage. Under such 
conditions, many Ss show a remark- 
able deterioration of speech fluency, 
together with other phenomena which 
will be described. The phenomenon 
has been variously called delayed 
auditory feedback, delayed speech 
feedback, and delayed sidetone. The 
term “delayed auditory feedback” 
(DAF) will be used in this review, 
since the effects of delay are not con- 
fined to speech, while the term side- 
tone has special meaning in engineer- 
ing. 
Since this review is concerned 
mainly with the production of ab- 
normal patterns of speech by means of 
delayed feedback, it should be pointed 
out that alterations of feedback may 
often either facilitate rate of speech, 
or lead to increased articulation 
clarity; in fact it has been suggested 
that these techniques may be used to 
improve speech. Thus, intelligibility 
of speech has been demonstrated to 
increase when S speaks in noisy con- 
ditions (Butler & Galloway, 1957); 
when the high frequency components 
of airborne feedback are attenuated 
(Peters, 1955); and when airborne 
feedback is binaurally occluded (Black 
& Tolhurst, 1956). Black (1950) 
showed that level and duration of 
speech are dependent upon room size 
and shape. The speaker, in other 
words, adjusts both the level and 
precision of his speech under changed 
acoustic conditions to produce the 
most efficient communication pos- 
sible. It would seem, furthermore, 
that the feedback mechanism sets 
limits to the rate at which normal 
speech can proceed, since it has been 
shown that artificially increasing the 
rate of airborne feedback enables S to 
speak more rapidly (Davidson, 1959; 
Peters, 1954). 


PRODUCTION OF DAF 


The delay of airborne feedback 
may be produced by the use of a 
magnetic tape recorder, modified so 
that it contains a fixed playback and 
a movable record head, or vice versa. 
The S's voice production is recorded 
at the record head, delayed by an 


interval dependent on the distance 
between record and playback heads 
(at a constant tape speed) and then 
transmitted via the playback head 
to S's headphones so that it is heard 
with the desired delay. A continuous 
loop of tape enables the high tape 
speeds necessary to be achieved, while 
an erase head ensures that the tape is 
clear when it again reaches the record 
head. In this way, Fairbanks and 
Jaeger (1951) were able to obtain 
delays up to .90 sec. at the relatively 
slow tape speed of 15 inches per sec. 
Tiffany, Hanley, and Sutherland 
(1954) obtained delays from .14 to 
1.40 sec. It will be clear that varying 
delays can be obtained either by vary- 
ing the distance between record and 
playback heads, or varying the tape 
speed, or both. As Tiffany et al. 
(1954) point out, a satisfactory piece 
of apparatus should allow for a wide 
range of delay times and continu- 
ously variable delay. Detailed de- 
scriptions of the apparatus required 
may be found in the two papers just 
mentioned. 
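
The relation between head spacing, tape speed, and delay described above is simple division. The following sketch is illustrative only: the function names are hypothetical, and the head spacing shown is back-calculated from the .90 sec. and 15 inches per sec. figures rather than reported in the text.

```python
def delay_seconds(gap_inches: float, tape_speed_ips: float) -> float:
    """Delay produced while the tape travels from record head to playback head."""
    if tape_speed_ips <= 0:
        raise ValueError("tape speed must be positive")
    return gap_inches / tape_speed_ips


def head_gap_inches(delay_sec: float, tape_speed_ips: float) -> float:
    """Head spacing needed for a desired delay at a given tape speed."""
    return delay_sec * tape_speed_ips


# Spacing implied by a .90 sec. delay at 15 inches per sec. (back-calculated).
gap = head_gap_inches(0.90, 15.0)
print(gap)
print(delay_seconds(gap, 15.0))
print(delay_seconds(gap, 30.0))  # doubling the tape speed halves the delay
```

Either argument can be varied to reach a desired delay, which is why an apparatus with a movable head and variable speed covers a wide and continuous range.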

Various other refinements may be 
added to the basic apparatus de- 
scribed above, both on the stimulus 
and response side. Thus, it may be 
desirable to control the intensity of 
feedback at S's ear. This may be 
achieved most simply by an auto- 
matic volume control, interpolated 
between the playback head and the 
speaker's headphones. 

The average auditory feedback 
delay under normal conditions is con- 
sidered to be about .001 sec. and it 
has already been noted that it is pos- 
sible to shorten, as well as lengthen, 
this delay. Peters (1954), by means 
of elaborate electronic tubes, ob- 
tained shortened delay times of .0003 
and .00015 sec.; but a delay of .0005 
sec. was achieved by Davidson (1959) 
simply by placing the microphone at 
the right corner of the speaker's 
mouth. A slightly longer than normal 
delay (.0015 sec.) was obtained if the 
microphone was placed 12 inches 
from and directly in front of the 
speaker's mouth. While Davidson's 
technique is probably not as reliable 
as that of Peters, and permits only 
small variations from normal delay 
times, it produced results very similar 
to those of Peters. 


MATERIALS 


The task set S under DAF has 
varied widely. At one extreme, 
Chase, Harvey, Standfast, Rapin, and 
Sutton (1959) investigated the effect 
of DAF on the repetition of the sound 
[b] as in "book." Black (1951) 
utilized sets of five-syllable phrases 
carefully matched for characteristics 
such as equivalence of natural inten- 
sity induced by reading them in nor- 
mal conditions. Prose passages have 
been most commonly used, varying 
from relatively uncontrolled material 
of varying lengths (Fairbanks, 1955; 
Spilka, 1954b; Tiffany & Hanley, 
1952) to passages which have been 
phonetically balanced (Spilka, 1954a), 
equated for difficulty level (Win- 
chester, Gibbons, & Krebs, 1959), or 
chosen so as to contain all English 
speech sounds (Arens & Popplestone, 
1959). At the other extreme, even the 
content has been indeterminate, as 
when S is asked to say nursery rhymes 
(Beaumont & Foss, 1957), or first 
say, and then explain the meaning of, 
simple proverbs (Korowbow, 1955). 

Butler and Galloway (1957, 1959) 
used five two-digit numbers which 
were successively flashed at random 
in one of five different positions on a 
screen at varying rates of presenta- 
tion. The advantage of this tech- 
nique lies in the fact that it solves the 
problem of controlling the structure, 
content, and different "natural" in- 
tensity levels of words and phrases; 
eliminates variance in speaking rate 
within a test passage; and avoids the 
possibility of S counteracting the 
effects of the delayed signal by con- 
centrating on the content. 


INDEPENDENT VARIABLES 


The principal independent vari- 
ables utilized have been the delay 
time and the intensity level of the 
feedback at the speaker's ear. 


Delay Time 


The study by Black (1951) is rep- 
resentative. He used delay intervals 
varying from zero to .30 sec. by .03- 
sec. intervals. Fairbanks (1955) used 
intervals of zero, .10, .20, .40, and .80 
sec. In most cases, intensity level of 
feedback has been held constant at a 
given value while delay time has been 
varied, but Butler and Galloway 
(1957) used four delay times and four 
intensity levels in a factorial design, 
while Atkinson (1954) used 10 delay 
times and three intensity levels. It 
may be noted that different groups 
of Ss may be used for each combina- 
tion of delay and intensity, or S may 
be used as his own control, experienc- 
ing each combination successively. 
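
The designs mentioned above can be enumerated mechanically. The sketch below is only illustrative: it builds Black's delay series (zero to .30 sec. in .03-sec. steps) and counts the cells of a delay-by-intensity factorial design. The index labels and function names are hypothetical, since the text does not list the actual delay and intensity values used in the factorial studies.

```python
from itertools import product


def delay_series(start_ms: int, stop_ms: int, step_ms: int) -> list[float]:
    """Delay values in seconds; integer milliseconds avoid rounding drift."""
    return [ms / 1000 for ms in range(start_ms, stop_ms + 1, step_ms)]


def factorial_cells(n_delays: int, n_intensities: int) -> list[tuple[int, int]]:
    """Every (delay index, intensity index) combination of a factorial design."""
    return list(product(range(n_delays), range(n_intensities)))


black_delays = delay_series(0, 300, 30)    # zero to .30 sec. by .03 sec.
butler_galloway = factorial_cells(4, 4)    # four delays x four intensities
atkinson = factorial_cells(10, 3)          # 10 delays x three intensities

print(black_delays)
print(len(butler_galloway), len(atkinson))
```

Each cell is then either assigned its own group of Ss or presented in turn to the same S, as noted above.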

As has already been pointed out, 
two studies have shortened the feed- 
back delay (Davidson, 1959; Peters, 
1954) while Butler and Galloway 
(1957) used “random” delay, i.e., 
playing back a recording of S reading 
under zero delay while he was reading 
a second passage. 


Intensity of Feedback 


Several different criteria have been 
used in specifying this variable and 
would appear to account in part for 
discrepant results. The intensity level 
may be defined in a purely physical 
way, without reference to S. Thus, 
Atkinson (1954) presented the feed- 
back at 0, 10, and 20 db. above a con- 
stant 75-db. noise level in the head- 
phones. Peters (1954) presented the 
feedback 0 and 5 db. below and 5 and 
10 db. above normal sidetone pressure 
level. More commonly, however, the 
intensity has been related to either 
the threshold for speech reception 
(SRT) (Hanley & Tiffany, 1954b; 
Tiffany & Hanley, 1952) or the thresh- 
old for speech detection (SDT) 
(Butler & Galloway, 1957). Since the 
former threshold is certain to be 
higher than the latter, it follows that 
an intensity level 75 db. above SRT 
produces a higher physical level than 
one which is 75 db. above SDT. 
Tiffany and Hanley (1956) have also 
utilized the spondee recognition 
threshold as a baseline. 
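
The point about baselines can be made concrete with a small calculation. The threshold values below are purely hypothetical and serve only to illustrate the arithmetic; the text gives no actual SRT or SDT figures, and the function name is likewise an assumption.

```python
def physical_level_db(threshold_db: float, level_above_threshold_db: float) -> float:
    """Physical level of feedback that is specified relative to a threshold."""
    return threshold_db + level_above_threshold_db


# Hypothetical thresholds for one S (illustrative values, not from the text).
SRT_DB = 20.0  # speech reception threshold
SDT_DB = 12.0  # speech detection threshold, lower than the SRT

above_srt = physical_level_db(SRT_DB, 75.0)
above_sdt = physical_level_db(SDT_DB, 75.0)
print(above_srt, above_sdt, above_srt - above_sdt)
```

The two nominally identical "75 db." settings differ physically by exactly the gap between the two thresholds, which is why results reported against different baselines cannot be compared directly.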

It has been usual to maintain delay 
time constant while varying intensity 
level. Intensity level at the head- 
phones has varied from 10 to 75 db. 
above SRT and from 20 to 80 db. 
above SDT. 

In addition to the two major in- 
dependent variables, Winchester and 
Gibbons (1957) contrasted various 
modes of presentation of DAF. With 
constant delay and intensity level, 
they presented DAF binaurally; uni- 
aurally but without masking of the 
other ear; uniaurally but with the 
other ear masked; and without feed- 
back or masking. In the latter case S 
wore headphones but received no 
feedback through them. 

The only other modification of 
consequence is that utilized by 
Hanley, Tiffany, and Brungard 
(1958) who presented DAF in bursts 
rather than continuously. 


DEPENDENT VARIABLES 


In this section we shall present the 
main findings related to changes in a 
number of dependent variables when 
the principal independent variables 
are manipulated. The modifying 
influence of related independent in- 
tervening variables will be presented 
later. 


Duration of Phrase 


This refers to the time taken to 
read a standard phrase (Black, 1951) 
or a passage of prose, or any of the 
materials described earlier. If length 
of passage is divided by time taken, 
then a measure of reading rate is ob- 
tained. Hanley and Tiffany (1954b) 
calculated a mean rate reduction 
score, this being the time to read the 
passage under normal conditions 
minus the time to read it under de- 
lay. Various other measures, such as 
syllable duration time (Spilka, 
1954b), percentage phonation time 
(Fairbanks, 1955; Spilka, 1954b), or 
time to make each verbal response in 
a free-responding situation (Korow- 
bow, 1955) may also be included here. 
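
The two measures defined at the start of this section can be written out as follows. This is a sketch only: the function names are hypothetical, the timings are invented for illustration, and the rate reduction score is computed exactly as worded above (normal time minus delay time), so that slower reading under delay gives a negative value.

```python
def reading_rate(passage_length: float, time_sec: float) -> float:
    """Reading rate: length of passage (e.g., in syllables) divided by time taken."""
    if time_sec <= 0:
        raise ValueError("time must be positive")
    return passage_length / time_sec


def rate_reduction_score(normal_time_sec: float, delay_time_sec: float) -> float:
    """Time under normal conditions minus time under delay, as worded above."""
    return normal_time_sec - delay_time_sec


# Invented timings for a 200-syllable passage (illustrative only).
normal_time, delay_time = 40.0, 50.0
print(reading_rate(200, normal_time))   # syllables per sec., normal conditions
print(reading_rate(200, delay_time))    # syllables per sec., under delay
print(rate_reduction_score(normal_time, delay_time))
```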

Duration of phrase as a function of 
delay. Black (1951) measured the 
time to read five-syllable phrases as a 
function of delay times ranging from 
.03 to .30 sec. with intensity at head- 
phones constant. He found that dura- 
tion increased as a function of delay 
up to .18 sec. then declined; that, 
whereas the general trend was linear, 
there was a discrete increment at .06 
sec. and that even the smallest delay 
produced a measurable increase in 
duration. Atkinson (1953) confirmed 
Black's findings while Fairbanks 
(1955), using a single sentence, found 
that total sentence duration and 
mean duration of phonations (unin- 
terrupted periods of phonation) 
showed growth and decline character- 
istics similar to those of Black with a 
peak at .18 sec. 

Discordant results were, however, 
obtained in a carefully controlled 
study by Spilka (1954b). He found 
that syllable duration and percentage 
phonation time both lengthened un- 
der DAF compared with no delay, 
but he could find no differential 
effect of delay upon those variables. 
Spilka's study did differ in important 
respects from that of Black, however. 
Different Ss were used for the differ- 
ent delay conditions, relatively long 
prose passages were read, and the 
intensity level at the headphones was 
120 db. for the feedback condition. 

Conversely, both Peters (1954) 
and Davidson (1959) have demon- 
strated that speeding up the feedback 
rate by methods indicated earlier in 
this paper leads to a decrease in mean 
duration, that is, facilitates rapid 
speech. 

Duration of phrase as a function of 
intensity. Two studies (Hanley & 
Tiffany, 1954b; Tiffany & Hanley, 
1952) have assessed the effect on 
reading rate of various levels of in- 
tensity of feedback. The results of 
both studies indicated that a reduc- 
tion in reading rate accompanied an 
increase in intensity of feedback, the 
relationship being roughly linear. 

Interaction effects. Butler and 
Galloway (1957) used a factorial 
design involving four delay times and 
four intensity levels, different Ss 
being assigned to each condition. 
They found a significant interaction 
effect between delay and intensity: 
at 50 db. intensity, delay times 
showed no differential effects, all 
differing about equally 
from synchronous feedback; whereas 
at 80 db. intensity, a differential 
delay effect was present with .17 sec. 
producing most errors. These results 
are in agreement with those of Black 
(1951) who used a high intensity level 
and with those of Tiffany and Hanley 
(1952), if allowance is made for the 
latter’s use of SRT from which to 
measure intensity. The results of 
Butler and Galloway (1957) also sug- 
gest an explanation for Spilka's fail- 
ure to find differential delay effects. 
Possibly, there is an optimal range of 
intensities, within which various de- 
lays will be differentially effective; 
outside these limits (on the high or 
low side) differential delay effects 
may be swamped by direct intensity 
effects at all delays. 


Intensity of Utterance 


DAF produces changes in the in- 
tensity of utterance, or sound pressure 
level of speech, as it is alternatively 
called. Black (1951) found that mean 
intensity of response increased as a 
function of increased delay up to .09 
sec. delay, and then remained con- 
stant, and that even the smallest 
delay produced a measurable increase 
in intensity. Once again, his results 
were confirmed by Atkinson (1953) 
and once again Spilka (1954b) found 
that, although both mean vocal 
intensity and variance of vocal inten- 
sity increased under DAF, the pattern 
of change was not in agreement with 
the results of Black. Fairbanks (1955) 
found a constant increase of 10-12 db. 
over the entire range of delay times 
studied, but these findings of Spilka 
and Fairbanks do not necessarily 
conflict with those of Black, since 
the shortest delay time used by 
Fairbanks was .10 sec., while the 
shortest used by Spilka was .094 sec. 


Fundamental Frequency 

Fairbanks (1955) found a rise in 
fundamental frequency from 109.5 
cps at zero delay to about 130 cps at 
all delay levels from .10 to .80 sec. 
No differential delay effect was ap- 
parent but he did not investigate 
delays shorter than .10 sec. 


Intelligibility 
The speech of S under DAF may be 
presented under noisy conditions to 
listeners who are required to make 
ratings of intelligibility of the speech. 
Atkinson (1954) found a decrease in 
intelligibility as intensity of DAF was 
increased, nonsystematic changes as 
delay was varied, but no interaction 
between delay and intensity. David- 
son (1959) found that panels of judges 
could not detect any change in 
intelligibility of speech under slightly 
longer (.0015 sec.) or shorter (.0005 
sec.) delays. 


Articulatory Changes 


The changes in speech rate and 
intensity previously described may be 
regarded as deriving from more basic 
articulatory changes, or both these 
changes may be regarded as deriving 
equally from some even more basic 
factor. The types of articulatory 
change which have been noted under 
DAF include the following: repetition 
of syllables and continuant sounds 
(Atkinson, 1953; Fairbanks & Gutt- 
man, 1958; Lee, 1951; Tiffany & 
Hanley, 1956), mispronunciations 
(Atkinson, 1953), omissions (Tiffany 
& Hanley, 1956; Fairbanks & Gutt- 
man, 1958), substitutions (Fairbanks 
& Guttman, 1958), number of word 
endings omitted (Korowbow, 1955), 
percentage of correct words (Fair- 
banks & Guttman, 1958). Clearly, 
some of these measures overlap in 
meaning. Tiffany and Hanley (1956) 
derived a measure of general speech 
effectiveness and Fairbanks and Gutt- 
man (1958) a measure of general 
articulatory accuracy. Deserving of 
special mention are the discovery by 
Korowbow (1955) that intrusions 
diminished under DAF, and the find- 
ing of Fairbanks and Guttman (1958) 
of an interaction between delay and 
type of error. Number of omissions 

doubled as delay changed from zero 
to .2 sec., but additions became 20 
times as common. The sole discord- 
ant finding is that of McCroskey 
(1956). He found no change in four 
measures of mean number of correctly 
articulated words from no delay to .18 
sec. delay. In view of Atkinson's 
(1954) failure to find any interaction 
between delay and intensity for 
intelligibility ratings, it is unlikely 
that the discrepancy between the 
results of McCroskey and those of 
Fairbanks and Guttman can be ex- 
plained along these lines. 


MEASUREMENT OF SPEECH CHANGES 


It will be clear from what has been 
said already that a variety of tech- 
niques has been used to evaluate the 
changes in speech which take place 
under DAF. In general, it may be 
said that it is not difficult to demon- 
strate changes under DAF, whether 
crude or refined indices of change are 
used. However, the stage has un- 
doubtedly now been reached where 
more refined analyses of the kind 
described by Fairbanks and Guttman 
(1958) should be employed. A few 
comments only will be made. Hanley 
and Tiffany (1954b) paired records of 
normal adults reading under no delay 
with records read either under delay 
or no delay and requested judges to 
indicate the instances in which a pair 
included a delay record. Judges 
made few errors at high intensity 
feedback levels, but misidentified 
many normal records as DAF records. 
In another experiment, Hanley et al. 
(1958) provided judges with galvanic 
skin records only and required them 
to determine whether DAF had been 
applied and, if so, at what intensity 
level. This task surprisingly was very 
successfully accomplished. 

Verzeano (1950, 1951) has de- 
scribed the use of a frequency ana- 
lyzer which records “units” of speech 
in terms of an arbitrarily determined 
pause in the flow of speech, e.g., it 
records each time the flow of speech 
is interrupted for a period longer than 
one second. Although this technique 
has not been used to analyze speech 
under DAF, it could prove a very use- 
ful method of analysis. 

Sherman and her colleagues have 
investigated very thoroughly various 
scaling methods for estimating the 
difficulty of speech at a given mo- 
ment, and have concluded that the 
method of equal-appearing intervals 
is the most appropriate (Lewis & 
Sherman, 1951; Sherman, 1952; Sher- 
man & McDermott, 1958; Sherman & 
Moodie, 1957). Rawnsley and Harris 
(1954) used the spectrograph (a ma- 
chine which produces a visual record 
of the frequency, intensity, and dura- 
tion of any sound) to compare the 
structure of words and phrases spoken 
under DAF and normal conditions by 
the same S. They found that, if part 
of a word is repeated, the first utter- 
ance of the part resembles the struc- 
ture of the part when spoken in iso- 
lation, whereas the repetition shows a 
change towards the structure of the 
part in relation to the whole. This 
method of analysis has so far been 
used only on rare occasions to analyze 
speech changes under DAF. 

On the whole, it may be said that a 
multiplicity of techniques is available 
for accurate analysis of response 
measures. Thus far, the most detailed 
studies of the exact nature of the 
changes in speech under DAF are 
those of Fairbanks (1955) and Fair- 
banks and Guttman (1958). 


ADAPTATION TO DAF 


Adaptation effects have been stud- 
ied from two aspects: the extent to 
which adaptation takes place while 
reading continues under DAF, and 
the degree to which speech returns to 
normal when delay is removed. 


Adaptation to DAF 

Atkinson (1953) found no adapta- 
tion of either sound pressure level or 
duration when S read a total of 60 
standardized phrases. Winchester 
et al. (1959), however, used ten 200- 
syllable passages equated for diffi- 
culty and read under a delay of .16 
sec. at 60 db. No adaptation was 
found during the first two passages 
(400 syllables); but adaptation did 
take place during the remaining read- 
ing, the tenth passage being read 
about 12 sec. faster than the first (the 
problem of control for practice effects 
is discussed later). Tiffany and 
Hanley (1956) required S to read a 
45-word prose passage 12 times on 
two occasions separated by a week. 
Speed of reading showed no adapta- 
tion within or between sessions; 
fluency breaks (omissions and repeti- 
tions) showed no change within a 
series of readings, but declined signifi- 
cantly between series. The correlation 
between reading speed and fluency 
was .72 for the first series, but only .39 
for the second series. Tiffany and 
Hanley concluded that readers may 
learn to avoid the “stuttering” but do 
not overcome the change in rate, that 
is, adaptation is only partially 
achieved. Beaumont and Foss (1957) 
found a correlation of .83 between 
reading times for equivalent passages 
under DAF read at an interval of 2 
weeks. 

On the whole, then, the results of 
these studies indicate that while some 
degree of adaptation does take place 
to continued DAF, the adaptation is 
not complete. Winchester et al. 
(1959) have indicated that these 
adaptation effects may be prevented 
by increasing the feedback intensity 
or by changing the delay time, while 
Hanley et al. (1958) prevented 
adaptation by presenting DAF in- 
termittently. 
Persistence of DAF Effects on Normal
Speech 
The results here are fairly consist- 
ent. Tiffany and Hanley (1952) 
found no difference for two control 
readings, one taken before and one 
after reading under DAF. Black (1955), using
a similar design, found that duration 
effects tended to persist into normal 
speech, but not changes in sound 
pressure level. Leith and Pronko 
(1957) found an immediate return to 
normal speech rate and level when 
DAF was removed and suggest that a 
possible source of variance here is 
whether, during the control readings, 
S knows whether or not DAF will 
again be applied. Finally, Tiffany 
and Hanley (1956), in the study 
previously referred to, reported no 
difference in mean reading time for 
12 pre- and 12 post-DAF normal 
trials. They did show, however, that 
a residual effect was present in Ss 
whose speech had been severely 
affected by DAF, whereas Ss rela- 
tively unaffected by DAF showed an 
increase in normal reading rate. In 
general, however, it is clear that most 
Ss are able to resume normal speech 
as soon as DAF is removed. 


CONFOUNDING VARIABLES 


Although research work in the
field of DAF has been of a
relatively high standard, and although
the phenomenon naturally lends itself 
to satisfactorily designed experi- 
ments, a surprisingly large number of 
traps await the investigator. The 
control problems which arise may be 
grouped into a number of categories. 


Reading Material 


We have already outlined the main 
types of material which have been 
used. Black (1951) has constructed 
five-syllable phrases equivalent in 
mean duration and intensity values 
under normal conditions, while Arens
and Popplestone (1959) used a passage
containing all English speech sounds. 
It will be obvious that if equivalent 
passages are to be used under various 
delay conditions then both their con-
tent and structure must be controlled,
since different words and phrases 
have different "natural" intensity 
levels, as Black showed. Indeed, 
Kline, Guze, and Haggarty (1954), 
although using only a single case, 
suggested on the basis of their findings 
an interaction between difficulty of 
the material and the effect of delay— 
the more difficult passages showing a 
disproportionate degree of disturb- 
ance compared with the easy. Simi- 
larly, Spilka (1954b) found a signifi- 
cant interaction between length of 
reading passage and delay time for 
average syllable duration; and a 
significant main effect of passage 
length for vocal intensity and vocal 
intensity variance. 


Progressive Errors 


The use of a large number of delay 
times and intensity levels in a fac- 
torial design naturally involves the 
use of large numbers of Ss, if the 
numbers in each cell are to be of the 
order of, say, five. Many authors have
preferred to subject each S to every 
condition, but it is clearly essential in 
this case to control for progressive 
errors by appropriate designs. This 
has not always been done (e.g.,
Korowbow, 1955).
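
One familiar way of controlling for progressive errors when each S
receives every condition is to rotate the order of conditions across Ss
in a Latin square. The following is a minimal sketch only, not a design
taken from any of the studies reviewed; the delay values and the
function name `latin_square_orders` are assumptions made for
illustration.

```python
def latin_square_orders(conditions):
    """Return a cyclic Latin square: row i is the condition order for S number i.

    Each condition appears exactly once in every row (every S receives all
    conditions) and exactly once in every column (every serial position).
    """
    n = len(conditions)
    return [[conditions[(row + col) % n] for col in range(n)] for row in range(n)]


# Hypothetical delay times in seconds, chosen only for illustration.
delays = [0.0, 0.1, 0.2, 0.3]
orders = latin_square_orders(delays)
for subject, order in enumerate(orders, start=1):
    print(f"S{subject}: {order}")
```

A cyclic square of this kind equates serial position only. It does not
balance which condition immediately precedes which, so a design that
also balances sequence would be needed if carry-over from one delay to
the next is suspected.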


Sound Pressure Level at the Ear 


It has already been pointed out 
that different methods have been 
used to estimate sound pressure level 
at the ear. It would seem to be highly
desirable that some standard form of 
reference be adopted. Chaiklin (1959) 
has recently discussed and compared 
several different types of threshold 
measurement. A related problem con-
cerns whether or not the sound pres- 
sure level should be maintained con- 
stant while S is speaking or whether 
it should be allowed to vary as S 
varies the intensity of his response. 
Atkinson (1952) showed that the 
loudness of the stimulus tone affects 
the sound pressure level of S's re- 
sponse. The question is clearly an 
important one, since it has been 
shown that DAF produces an increase 
in intensity of S's response which, of 
course, raises the level at the ear, a 
vicious circle being set up. Curiously, 
the effects of controlled versus un- 
controlled sound pressure level at the 
ear have not been experimentally 
investigated. Butler and Galloway 
(1957) showed that loudness per se 
does not affect speed of reading, since 
no effect of loudness was found with 
synchronous feedback. 


Pretraining 


Little attention has been paid to 
the question of familiarizing S with 
the situation of facing complex ap- 
paratus, wearing closely-fitting head- 
phones, and so on. Tiffany and 
Hanley (1952) gave Ss 5 minutes’
preacquaintance with the passage 
they were to experience subsequently 
under DAF. Atkinson (1953, 1954) 
and others have given training in 
reading Black’s standardized phrases 
while Butler and Galloway (1957) 
trained S to become familiar with the 
location of the positions on the dial 
where the numbers would appear. 


Reading Rate Instructions 


Only one study has deliberately 
varied instructions concerning read- 
ing rate. Peters (1954) instructed S 
to read at natural and at maximal 
rate on separate occasions. In both
cases, an increase in the rate of feed- 
back was accompanied by a faster 
reading rate. No studies concerned
with increased delay in feedback have 
examined this variable, which could 
be of some importance, since changes 
in reading rate accompany changes in 
delay of feedback. 


Consistency of Normal Reading Rate 


In studies on adaptation to DAF, 
it would be important to have control 
data on practice effects. Gibbons, 
Winchester, and Krebs (1958) used
ten 200-word passages of equal diffi-
culty which were read successively 
without a break by Ss wearing head- 
phones. The rate of reading remained 
remarkably consistent throughout the 
10 passages, no effect of prolongation 
of reading time being found. 


Noisy Background 


It has been shown by Butler and 
Galloway (1957) that the effects of 
DAF are not simply due to interfer- 
ence effects of a noisy background, 
since the condition of random delay 
produced no disturbance. Winchester 
and Gibbons (1958) investigated the 
effects of a masking tone, presented 
uniaurally to one group, binaurally to 
another, at 80 db. above sensation 
level on the time to read a 500- 
syllable prose passage under no delay. 
No difference in time was found be- 
tween these groups and a group read- 
ing under no delay wearing head- 
phones but without the masking tone. 
In a study by Peters (1956), speakers 
read standardized intelligibility lists 
while simultaneously hearing various 
kinds of auditory signals, ranging 
from the same, similar but not identi- 
cal, and unrelated material to mean- 
ingful "flight-patter" phrases and 
babel. The results indicated speakers 
were more intelligible when the audi- 
tory signals were babel or words 
similar to those being read than when 
the signals were the same or unre-
lated words. 


Loudness Recruitment 


This phenomenon occurs in par- 
tially deaf people who may perceive 
stimuli above their hearing threshold 
as louder than do people with normal 
hearing. Thus, if partially deaf Ss 
are included in a control group under 
DAF they may show more reaction to 
DAF at high intensity levels than 
normal Ss because of loudness re- 
cruitment. The problem has been 
investigated. Harford and Jerger 
(1959) found that a group of normal 
Ss reading under DAF at various 
intensity levels above a binaural 
masking tone (which artificially pro- 
duced partial deafness) showed more 
disturbance than control Ss reading 
without masking, presumably because 
of the effects of experimentally in- 
duced loudness recruitment. 


Stimulation Deafness 


It is well known that continued 
exposure to high intensity sound 
produces partial temporary deafness. 
The effect of this in the DAF situa- 
tion would be to reduce the sound 
pressure level at the ear if S were 
reading a continuous prose passage. 
The effects of stimulation deafness in 
relation to DAF have not been in- 
vestigated. 


INDIVIDUAL DIFFERENCES 


One of the most striking features of 
DAF has been the marked individual 
differences in the degree to which S
can continue to speak normally under 
DAF. A few Ss show little disturb- 
ance; others are almost totally in- 
capacitated; the majority fall some- 
where between these extremes. Most 
of the work in this area has been con- 
cerned with the study of personality 
traits and physiological concomitants
under DAF. 


Personality Traits 


There seems to be general agree- 
ment that speakers with high verbal 
facility (Arens & Popplestone, 1959)
or high initial intelligibility (Atkin- 
son, 1954) are less affected by DAF 
than speakers with low verbal facility 
or intelligibility. In line with this, 
Beaumont and Foss (1957) found that 
poor speakers showed greater adapta- 
tion during DAF, because the better 
speakers performed at a higher level 
throughout. 

The difficulties of relating more 
specific personality traits to reaction 
to DAF were shown in an early study 
by Spilka, Hanley, and Steer (1953). 
In the first part of the study they 
measured speaking intelligibility un- 
der conditions of high noise interfer- 
ence (without delay) and found that 
Ss most successful in overcoming 
interference were aggressive and in- 
tolerant, i.e., accustomed to overcom- 
ing obstructions by force. A replica- 
tion of the study, however, failed to 
confirm these indications. In a later 
study, Spilka (1954a) correlated vari- 
ous indices of vocal disturbance to 
a number of personality traits meas- 
ured by the California Test of Per- 
sonality (Secondary Series), Guil- 
ford’s STDCR, the total E Scale, and 
the Paranoia and Schizophrenia sub- 
tests of the MMPI. Spilka’s general 
hypothesis was that Ss who rely on 
exteroceptive (in this case, auditory) 
cues will be most affected by DAF, 
since this involves a disturbance in 
external balance, whereas Ss who 
rely mostly on proprioceptive (kin- 
esthetic) cues will be least affected. 
The underlying assumption is that all 
Ss rely on a combination of internal
and external cues for monitoring 
purposes, but to differing degrees. 
From this general hypothesis, he 
derived a number of specific predic- 
tions, e.g., that individuals with nega- 
tive self-attitudes, and paranoid and 
rigid persons, will all be hypersensi- 
tive to external stimulation and hence 
will be very susceptible to DAF; 
whereas schizoid persons who depend 
largely on internal cues, will show low 
susceptibility to DAF. Spilka found 
that the voice variable most con- 
sistently related to personality vari- 
ables was change in vocal intensity 
variance. Increases in this variable 
were related to strong negative self- 
attitudes, poor personality adjust- 
ment, and paranoid tendencies; 
whereas decreases were related to 
schizoid modes of behavior. The 
relationships were all low but con- 
sistent and do provide some support 
for the general hypothesis. Further
support comes from a study on 
schizophrenic and normal children by 
Goldfarb and Braunstein (1958). 
Their hypothesis that schizophrenic 
children pay less attention to external 
stimuli than do normal children and 
should therefore be less affected by 
DAF is almost identical with that of 
Spilka. They rated the behavior and 
speech of 16 schizophrenic and 25 
normal children aged about 9 years. 
Under normal reading conditions, the 
speech and behavior of the schizo- 
phrenic children was significantly 
poorer than that of the normal chil- 
dren. Under DAF, however, all of the 
normal children showed gross speech 
impairment whereas the schizophrenic
children showed very diverse results 
—from no breakdown in speech to 
severe disturbance. 

Korowbow (1955) used an 852- 
item personality test, factorially de- 
signed, and obtained correlations be- 
tween speech disturbance and per-
sonality traits which may be regarded 
as generally in line with those pre- 
sented already; e.g., an increase in 
intensity of vocal amplitude was as- 
sociated with “sensitivity” and “emo-
tional reticence.” Comparisons, how-
ever, are not easy in the absence of
strictly comparable personality traits.

Beaumont and Foss (1957) found a 
positive relationship between tend- 
ency to perform poorly under DAF 
and tendency to show perseveration 
on the Luchins Einstellung test. 

It can be seen that a promising 
beginning has been made in relating 
personality variables to individual 
differences in performance under 
DAF. Whether personality variables 
can be shown to play more than a 
minor role in the explanation of these 
differences remains to be seen. 


Physiological Changes under DAF 


Doehring (1956), and Doehring 
and Harbold (1957), showed that 
under DAF at high intensity level, 
there was a significant increase in 
forearm and head muscle action 
potentials and heart rate compared 
with performance under no delay and 
in the resting state. Galvanic skin 
resistance decreased (indicating in- 
creased physiological reactivity) un- 
der DAF, while respiration showed a 
significant decrement at the end of 
each reading. A suggestion in the 
earlier study that there was a nega- 
tive correlation between amount of 
speech disturbance and amount of 
physiological disturbance was only 
partially confirmed in the later study, 
significant negative correlations being 
found between heart rate and speech 
rate and between heart rate and 
speech level. Hanley et al. (1958) 
found that at high intensity levels all 
Ss tended to show GSR disturbance, 
but that at low levels there was great 
variability. Once again, interaction
effects are shown to be of crucial im- 
portance. Hanley et al. also reported 
that some individuals manifesting 
severe GSR disturbance showed al- 
most no breakdown in speech, thus 
supporting to some extent the sug- 
gestion of Doehring. Further study 
is indicated. 


Developmental Aspects 


Developmental studies of suscepti- 
bility to DAF may throw much light 
on the acquisition of speech monitor- 
ing habits, but so far little work has 
been reported. Goldfarb and Braun- 
stein (1958) reported gross speech 
impairment in all normal children in 
their group, the average age being 
about 9 years. Chase, Sutton, First, 
and Zubin (1961) found that the 
speech of children aged 4-6 years was 
significantly less affected by DAF 
than was the speech of children aged 
7-9 years. Again further work is 
clearly indicated. 


DAF AND AUDITORY MALINGERING 


A great deal of interest has been 
shown in the possibility of using 
DAF to detect psychogenic deafness. 
Tiffany and Hanley (1952) showed 
that normal Ss were unable to over- 
come the effects of DAF when in- 
structed to behave as if they were 
deaf. Hanley and Tiffany (1954a)
described a case of psychogenic deaf- 
ness in which the patient had an ap- 
parent bilateral loss for pure tones of 
75-80 db. However, severe disrup- 
tion of speech occurred under DAF 
at 50 db. intensity level. Further 
tests revealed normal hearing for 
speech. 

Gibbons and Winchester (1957) 
investigated 70 Ss with medically 
diagnosed uniaural organic hearing 
losses, the threshold differential being 
at least 40 db. between ears. In one 
test condition, DAF was presented to
the better ear with the poorer ear 
masked; in the other condition, the 
reverse was the case, order being 
counterbalanced. It was found that 
oral reading time was significantly 
longer when the poorer ear was 
masked and the better ear subjected 
to DAF than vice versa; that is, when 
the better ear was masked, the poorer 
ear was relatively unaffected by DAF 
because of deafness in that ear. 
Gibbons and Winchester concluded 
that this technique can help to esti- 
mate the relative extent to which 
deafness is functionally or organically 
determined. Kline et al. (1954), using 
one S, showed that less speech dis- 
turbance was manifest under DAF 
when S was hypnotically deafened 
than in the waking state. Hanley 
et al. (1958) were also optimistic 
about the use of DAF to detect func-
tional deafness, particularly in con-
junction with changes in GSR thresh-
olds. More recent work, while not
denying the possible value of DAF 
for this purpose, has considerably 
qualified the earlier claims. Butler 
and Galloway (1959) point up the 
particular problem of the patient 
with mild organic hearing loss with a 
large functional overlay. The use of 
DAF is rendered difficult by the large 
individual differences in response to 
DAF by normal Ss, and by the re- 
cruitment phenomenon in hard-of- 
hearing Ss, already discussed, which 
might result in some hard-of-hearing 
patients behaving like normal Ss at 
some feedback intensity levels. They 
compared 60 hard-of-hearing Ss with 
48 controls under DAF at 50 and 
80 db. above SDT. Discrimination 
between the groups was obtained 
only at 50 db. and even at this inten- 
sity there was 30% misclassification. 
For individuals, the amount of hear- 
ing loss could not be accurately pre- 
dicted. The pessimistic conclusions
of Butler and Galloway are perhaps 
somewhat exaggerated, however, 
when it is realized that selection of 
the criterion groups would not be 
perfectly reliable. 

Harford and Jerger (1959) tested a 
normal control group and groups of 
patients suffering from labyrinthine 
hydrops (a form of deafness accom- 
panied by loudness recruitment) or 
bilateral otosclerosis (a form of con- 
ductive deafness without loudness 
recruitment). To control for differ- 
ences between the two clinical groups 
in ability to understand speech a 
group of normal Ss with binaural 
masking tone (producing recruitment 
but not speech discrimination loss) 
was added, while to control for age 
and deafness per se a fifth control 
group was added consisting of older 
normal Ss tested with and without 
ear plugs. These five groups were 
tested on the apparatus devised by 
Butler and Galloway (1957) at a 
delay of .167 sec. and at intensity 
levels of 10-50 db. above a spondee 
threshold. The results indicated that 
recruitment does produce an exag- 
gerated effect of DAF which cannot 
be accounted for by speech discrimi- 
nation loss (the masked normal and 
hydrops groups were like each other 
and different from the first control 
group at all sensation levels). A com- 
pletely unexpected result was the 
high error scores of the otosclerotic 
group, which differed significantly 
from the results for the fifth (normal) 
group. Some of the difficulties may 
be overcome by using a tapping test 
instead of speech. Not only does this 
overcome problems posed by bone 
conduction but Chase, Sutton, 
Fowler, Fay, and Rubin (1961) have 
shown that there is a significant 
effect of DAF on rhythmic tapping at 
feedback intensity levels as low as 10 
db. above the threshold of hearing
for a click containing frequencies from
500 to 2,000 cps. 

Thus, the usefulness of DAF as a 
test for psychogenic deafness is at 
present not clear. It is certainly 
clear, however, that loudness recruit- 
ment makes interpretation of particu- 
lar results very difficult. 


DAF AS A STRESSFUL SITUATION


Mention may be made that DAF 
has been successfully used as a form 
of stress by Pronko and Leith (1956) 
and Forney and Hughes (1961). 


DAF IN TASKS OTHER THAN
SPEECH 


The effects of DAF have been 
shown in a number of tasks not in- 
volving speech and, indeed, in tasks 
where it would superficially appear 
as if auditory feedback would play a 
relatively small role. Thus, Kalmus, 
Denes, and Fry (1955) found that 
rhythmical hand clapping (expected 
to be primarily mediated by proprio- 
ceptive feedback) was disturbed by 
DAF. Lee (1951) and Chase et al. 
(1959) discovered disturbances of 
tapping under DAF; e.g., the key was 
tapped harder, held down longer, 
more taps given than asked for, and 
pauses between taps lengthened. In 
a recent more extensive study Chase, 
Harvey, Standfast, Rapin, and 
Sutton (1961) compared the effects 
of DAF on similar speech and tapping 
tasks (repeating [b] and tapping in 
groups of three). They found that 
similar types of errors were com- 
mitted in both types of task; but that 
the correlation between error scores 
for the two tasks was insignificant, 
indicating that the feedback monitor- 
ing systems are relatively independ- 
ent. Hanley and Tiffany (1954a),
and several others, have reported 
that no individual has even approxi- 
mated normal whistling under DAF,
this skill being particularly subject to 
disturbance. 

It may be noted in passing that 
very similar disturbances have been 
found in nonauditory monitoring 
tasks. Thus, Van Bergeijk and 
David (1959) delayed visual feed- 
back of handwriting while kinesthetic 
feedback remained unchanged. When 
delays of up to .50 sec. were intro- 
duced in the visual feedback, disturb- 
ances in handwriting were produced 
which matched those found in speech. 


DAF AND STAMMERING 


One of the most interesting features 
of research in this area has been the 
implications of the work for the 
understanding of stammering be- 
havior. Lee (1951), in one of the 
earliest references, suggested that 
stammerers do not stammer when 
they are members of a group because 
feedback is provided by other group 
members. This focused attention on 
the possibility that stammering may 
be related to a defect in the percep- 
tual monitoring of speech processes. 
If the monitoring of ongoing speech 
involves, as Lee (1951) suggested, a 
closed feedback loop, then any failure 
in the feedback will lead to the signal 
(a particular word unit in this case) 
being repeated until the appropriate 
information does reach the monitor- 
ing system (perhaps by summation of 
stimuli) and the speaker can proceed. 
In a series of brilliant hypotheses, 
experimentally tested by deduction 
at every point, Cherry and Sayers 
(1956) presented cogent evidence that 
stammering is associated with the 
perception of the low-frequency com- 
ponents of speech which are mainly 
bone conducted. Blocking of air- 
conducted feedback did not affect 
stammering. If, however, the stam-
merer were completely prevented
from hearing his own voice while
speaking by the use of very intense 
white noise, stammering was com- 
pletely inhibited and normal speech 
resulted. Masking the high frequen- 
cies alone had no effect, whereas 
masking the low frequency compo- 
nents only had the same effect as 
white noise. These facts, of course, 
are in line with the common observa- 
tions that stammerers often do not 
stammer when they sing or whisper. 
The exact nature of the perceptual 
disability is unknown but it is inter- 
esting to note that in stammerers the 
disability is apparently related to a 
dysfunction of bone-conducted feed- 
back, whereas in normal Ss stammer- 
ing-like behavior is induced by inter- 
ference with air-conducted feedback. 
Cherry and Sayers further confirmed
the importance of perceptual monitor-
ing of speech by showing that the
"shadowing" technique (in which 
the stammerer follows closely and 
aloud a passage which he does not see 
and which is read by someone else) 
not only leads to total suppression of 
the stammering, but is a valuable 
therapeutic technique. The results 
obtained by Cherry and Sayers have 
been independently confirmed by 
Shane (1955) and by Maraist and 
Hutton (1957). The latter found 
that a 90-db. masking white noise 
resulted in the stammerer's speech 
approximating normal reading speed 
and accuracy. Utilizing various 
intensities of masking, they also re- 
ported a special increment in effi- 
ciency at 50 db. 

The Cherry-Sayers hypothesis was 
further examined by Sutton and 
Chase (1961), who found no differ- 
ence in reading speed of stammerers 
under conditions involving the pres- 
entation of white noise continuously 
while S was reading; while he was 
speaking but not while he was silent; 
or while he was silent but not while
he was speaking. This experiment, 
however, cannot be regarded as a 
crucial test of the perceptual defect 
hypothesis, since the feedback from 
speech is heard with a slight delay 
and hence in both of the discontinu- 
ous white noise conditions, at least 
part of the feedback would be masked. 

Whether or not the speech disturb- 
ances characteristic of stuttering are 
comparable to the disturbances 
found in normal Ss under DAF has 
been investigated only by Neelly 
(1961). He found that the adapta- 
tion effect in stutterers repeatedly 
reading the same prose passage (i.e., 
the tendency for stuttering to dimin- 
ish) was quite different in degree 
and structure from that found in 
normal Ss reading under a delay of 
.14 sec. Further, listeners were able 
to distinguish speech samples from 
the two groups with a high degree of 
accuracy but could not distinguish 
the groups when both were speaking 
under DAF. Neelly (1961) concluded 
that the perceptual characteristics of 
stutterers' speech are quite different 
from those of normal Ss under DAF 
and that "an adequate account of 
stuttering behavior—or the more 
comprehensive stuttering problem— 
is not to be found in the auditory 
feedback mechanism" (p. 78).
Neelly's results must be viewed 
cautiously, however, since he used 
only one delay interval and his nor- 
mal Ss had presumably received much 
less practice in reading continuously 
under DAF than had the stutterers 
under normal conditions. He also 
found large individual differences in 
the reaction of stutterers to DAF. 

A little known study by Birch and 
Lee (1955) showed that speech im- 
pairment could be significantly re- 
duced in Ss suffering from expressive 
aphasia by a masking tone of 265
cps presented binaurally. Their re-
sults were not, however, confirmed 
in a study by Weinstein (1959). 


DISCUSSION


As pointed out earlier, the monitor- 
ing of speech involves the utilization 
of feedback information from three 
sources: kinesthetic and propriocep- 
tive feedback resulting from the move- 
ments involved in speaking, trans- 
mission of spoken sounds to the audi- 
tory apparatus via the bony struc- 
tures, and transmission of sound via 
the air to the auditory apparatus. 

It is clear that the disruption of 
speech which results from DAF is not 
related to the absence of any of these 
feedback mechanisms per se, though 
presumably if all forms of feedback 
were totally eliminated speech could 
not proceed. So long, however, as one 
or more of the feedback mechanisms 
is in working order, relatively normal 
speech will proceed even in the ab- 
sence of the other two. Of the three 
modes of feedback control, it seems 
likely that the kinesthetic mode is 
the least important as far as the 
phenomenon under discussion is con-
cerned. This mechanism provides only
minor information about the actual
nature of the sounds which
are being produced. Furthermore,
McCroskey (1956) eliminated, by 
anesthetization, sensory innervation 
of the lower lip and cheek, the buccal 
and lingual gingivae, the anterior 
two-thirds of the tongue, the alveolus 
and teeth, and the upper lip. While 
he found a significant decline in ac- 
curacy of articulation under these 
conditions, the anesthetization did 
not affect the rate of progress of 
speech, although DAF did.

It might be supposed that in nor- 
mal speech the three kinds of feed- 
back are synchronous as to trans- 
mission time and that asynchrony 
under DAF is the critical factor. But
the effects of DAF cannot, it seems,
be completely accounted for in terms
of an artificially produced asynchrony
of this kind. McCroskey (1956)
pointed out that if asynchrony were 
the prime factor, then there should 
be a lessening of the effects of DAF 
if the kinesthetic feedback were 
eliminated or reduced by anesthetiza- 
tion. In fact, however, in his experi- 
ment, anesthetization did not lessen 
the DAF effect as far as rate of prog- 
ress of speech was concerned. Again, 
the fact that the DAF effect increases 
as a direct function of the sound pres- 
sure level argues against an asyn- 
chrony explanation, since the increase 
should progressively mask the unde- 
layed bone-conducted feedback. 
Several authors have pointed out 
that the high level of feedback is 
necessary to prevent S from counter- 
acting the airborne auditory delay by 
utilizing bone-conducted or residual
nondelayed auditory feedback. In
other words, S can resist the DAF 
effect to a considerable extent, pro- 
vided he can still utilize the (now 
asynchronous) bone-conducted feed- 
back. 

There is some empirical evidence 
relating to this problem. Winchester 
and Gibbons (1957) found that 
monaural DAF without masking of 
the other ear produced less disturb- 
ance than monaural delay with mask- 
ing of the other ear. However, Chase 
and Guilfoyle (1962) presented de- 
layed and undelayed feedback simul- 
taneously to both ears. The gain 
of the latter was either one-third, 
two-thirds, or equal to that of the 
DAF. They found that while in- 
creasing the gain of the undelayed 
feedback progressively reduced the 
disturbance produced by the DAF, 
speech did not return entirely to nor- 
mal even when the gain of the unde- 
layed was equal to that of the delayed 
feedback. Both these studies indicate
that availability of accurate feedback 
information through one channel 
assists S in resisting the disrupting 
effect of DAF. 
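
The stimulus arrangement in a study of this kind can be pictured as the
sum of an undelayed copy of the speech signal and a delayed copy, each
with its own gain. The sketch below illustrates only that arithmetic. It
is not the apparatus of Chase and Guilfoyle, and the sample rate, the
delay value, and the function name `mix_feedback` are assumptions made
for illustration.

```python
def mix_feedback(signal, delay_samples, delayed_gain=1.0, undelayed_gain=0.0):
    """Return what reaches the ear: undelayed_gain * x[n] + delayed_gain * x[n - d].

    Samples before the start of the signal are treated as silence (0.0).
    """
    mixed = []
    for n, sample in enumerate(signal):
        delayed = signal[n - delay_samples] if n >= delay_samples else 0.0
        mixed.append(undelayed_gain * sample + delayed_gain * delayed)
    return mixed


# Illustrative values only: an 8 kHz sample rate and a 0.2 sec. delay.
sample_rate = 8000
delay_samples = round(0.2 * sample_rate)

# A single click at time zero makes the two copies easy to see.
click = [1.0] + [0.0] * 2000
for ratio in (1 / 3, 2 / 3, 1.0):
    ear = mix_feedback(click, delay_samples, delayed_gain=1.0, undelayed_gain=ratio)
    # First value: the undelayed copy; second value: the delayed copy.
    print(round(ear[0], 3), ear[delay_samples])
```

With the undelayed gain at one-third, two-thirds, or equal to the
delayed gain, the click arrives first at the reduced level and again,
after the delay, at full level.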

These observations indicate the 
necessity for postulating some cen- 
tral controlling mechanism and the 
most obvious one to postulate is the 
existence of a comparator within the 
closed cycle feedback system. In this 
connection, the observations of Fair- 
banks (1954) are of great interest. 
Fairbanks pointed out that, in mon- 
itoring speech, any postulated mecha- 
nism must be able, not merely to 
estimate the present state of the 
system, but also to control that state 
or, in other words, to predict the 
future course of events. In his model 
of a closed cycle control system for 
speaking, Fairbanks included an ef- 
fector unit (producing the output 
from the system), a sensor unit 
(which picks up the output), and a 
controller unit. The latter comprises 
a storage unit, a comparator, and a 
mixer. The storage unit contains the 
short-term instructions for a set of 
speech units which must be displayed 
(through the effector unit) in a defi- 
nite time sequence. As each sequence 
is completed, a new set of instructions 
appears in the storage unit. The 
signal in the input at any given mo- 
ment is transmitted both to the 
effector unit and to the comparator 
and mixer. The feedback signals 
from the effector unit to the compa- 
rator are compared with the input 
information contained there and any 
discrepancy between the signals (the 
error signal) is relayed to the mixer 
unit. This latter unit combines the 
input signal and the error signal in 
such a way as, eventually, to reduce 
the difference to zero. At this point 
the system is in equilibrium. Within 
the comparator, however, is con- 
tained a predicting device which 
continuously predicts (presumably on
the basis of past experience) the fu- 
ture point at which the error signal 
will be zero. Thus, the input may be 
changing even while the effector unit 
has not yet completed the transmis- 
sion of the current unit. With a 
system of this kind, it is possible to 
predict what will happen if part of 
the system does not function prop- 
erly. Thus, if, as happens in DAF, 
the transmission of information from 
the effector unit through the sensor is 
delayed, the comparator will trans- 
mit an error signal to the mixer and 
the signal may be repeated, or the 
whole system may halt until the 
effector signals are transmitted back. 
It has been suggested (Stromsta, 
1959) that the locus of the compa- 
rator may be the cerebellum. 
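
The closed cycle just described can be sketched in a few lines of code. The following Python fragment is only an illustrative toy, not Fairbanks' own formulation: the function name `speak`, the discrete time steps, and the rule that the mixer simply re-issues a unit whenever the comparator detects a discrepancy are all assumptions made for the sketch, and the predicting device is omitted. It is meant to show that delaying the sensor path alone is enough to produce repetitions.

```python
from collections import deque


def speak(units, delay_steps, max_steps=100):
    """Emit `units` in order, advancing only when feedback confirms the current unit.

    delay_steps = 0 stands for normal feedback; delay_steps > 0 stands for DAF.
    Returns the list of units actually produced by the effector.
    """
    # Sensor path: what the comparator receives now was emitted delay_steps ago.
    sensor = deque([None] * delay_steps)
    emitted = []
    index = 0  # position within the storage unit's sequence of instructions
    while index < len(units) and len(emitted) < max_steps:
        target = units[index]      # input signal supplied by the storage unit
        emitted.append(target)     # effector produces the output
        sensor.append(target)      # output enters the (possibly delayed) sensor path
        heard = sensor.popleft()   # feedback arriving at the comparator
        error = heard != target    # comparator: any discrepancy is the error signal
        if not error:
            index += 1             # equilibrium; the next unit is released
        # otherwise the (assumed) mixer rule re-issues the same unit
    return emitted


print(speak(list("abc"), 0))  # no delay
print(speak(list("abc"), 1))  # one step of delay
```

In this toy version the undelayed call returns the three units unchanged, whereas a delay of one step causes each unit to be emitted twice, a crude analogue of the repetitions observed under DAF.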

It has, of course, been disputed 
whether serial skills such as speech, 
tracking behavior, etc., can be con- 
tinuously monitored in this way, 
when regard is had to the speed of 
neural conduction from the effectors 
to the brain. An alternative formula- 
tion would argue that the storage 
unit described by Fairbanks would 
trigger off a set of speech units which 
would then proceed without further 
monitoring unless some serious break- 
down occurred. The effects of DAF 
would represent one such example of 
a breakdown. Chase, Rapin, Gilden, 
Sutton, and Guilfoyle (1961) pointed 
out that delayed auditory feedback 
is in a sense a misnomer, since the 
phenomenon does not really refer to a
change in normal feedback, but to a 
delayed auditory event which is actu- 
ally foreign to the normal state of 
affairs. In their study, they were 
able to show that if S were prevented 
from watching his tapping (decreased 
visual feedback) and at the same time 
an unrelated visual stimulus were 
presented just after a tap, disorganization
of the tapping was produced.
It may be noted in this connection
that stammering can be completely 
inhibited if S and the experimenter 
read a passage of prose simultane- 
ously, even though the experimenter 
is reading quite different material 
to S, or is reading gibberish. On the 
other hand, Gibbs (1954), who pro- 
vided a careful evaluation of the 
literature relating to feedback con- 
trol, concluded, on the basis of his 
experiments, that continuous mon- 
itoring does appear to be feasible, 
while Stromsta (1959) considers that 
neural transmission times are com- 
patible with the hypothesis. The 
explanations are not, in fact, incom- 
patible with each other, but the 
problem of how they interact remains 
to be solved. 

The model put forward by Fair- 
banks works, of course, because it was 
constructed to parallel the observed 
facts. While this does not lessen its
value, it does not reduce the neces- 
sity for considering other possible 
explanatory theories nor the neces- 
sity for careful further experimenta- 
tion of the kind carried out by Chase 
(1958). He argued that if DAF 
facilitates the circulation and re- 
circulation of speech units in the 
speech-auditory feedback loop, then 
it should be possible to repeat a single 
speech sound more often in unit time 
under delay than under normal con- 
ditions. In his experiment, one group 
repeated the sound [b] as quickly as 
possible for 5 sec. under synchronous 
(i.e., faster than normal) feedback; 
and then repeated the sound with a 
feedback delay of .216 sec. A control 
group was tested twice under syn- 
chronous delay. Seventy-five percent 
of the experimental group showed a 
faster rate of repetition under delay.

A great deal more experimentation 
is still needed to explore the relationships
between the three types of feedback
and their disruption. It might
be expected, for instance, that similar 
disruptive effects on speech would be 
produced if bone-conducted feedback 
were delayed, with air-conducted 
feedback blotted out. In this connection,
Siegenthaler and Brubaker
(1957) have made many valuable 
suggestions as to future lines of re- 
search, many aspects of which have 
scarcely as yet been touched. Their 
suggestions fall into three categories. 
In relation to the speaker, they men- 
tion individual differences in relation 
to intelligence, reaction time, reading 
ability; amount or type of speech 
disturbance in relation to frustration 
tolerance, personality traits, hearing 
loss. In relation to speech output, 
they mention the effect of DAF on an 
acquired as opposed to a native 
language; the effect of DAF on reading
passages of various consonant/vowel
structure; and its effect on
passages of differing levels of diffi- 
culty. Finally, in relation to modifi- 
cations of the apparatus, they men- 
tion the use of separate microphones 
for each ear with DAF presented 
separately to each ear, but with 
different delay times. In this connec- 
tion, it may be mentioned that, since 
1950, at least 50 higher degree theses 
have been written on DAF, many of 
them dealing with important aspects 
not covered in the published litera- 
ture. Yet only a small proportion of 
these theses has been published. 
There can be no question but that 
the technique of DAF provides a 
most useful method of investigating 
the role of feedback mechanisms in 
the control of skilled response pat- 
terns and, as such, deserves, and 
requires, more attention than it has 
so far received, especially since it is 
clear that the technique is readily 
applicable to skills other than those 
involved in speech.

The 1950-61 literature is covered and is organized around the test designs,
the various scoring systems, and the diagnostic interpretations.
When scored objectively its validity in determining MAs for children, 
and as an additional tool in a test battery aimed at differential diagnoses, 
is acceptable. Symbolic interpretation remains highly subjective. The 
lack of standardization militates against utilizing the test as a norm
against which to judge other variables. 


The Bender Gestalt Test (BG) 
was built on the premise that ac- 
curate visual-motor behavior is a 
skilled act. Its nine geometrical de- 
signs are composed of dots, lines, 
angles, and curves combined in a 
variety of relationships. Individuals 
see and reproduce these geometrical 
designs differently. It is believed by 
those utilizing this tool, that there is 
a “normalcy range” in the manner of
reproducing the designs that is highly 
correlated with the hypothetical aver- 
age person. They further assume that 
deviations from the normal range re- 
flect deviations from the average in- 
dividual in intellectual capacity and 
functioning, emotional stability, per- 
ceptual relevancy, attitudinal logical- 
ity, need gratification patterns, ade- 
quacy of defense mechanisms, and 
soundness of brain tissues and chem- 
istry. 

The reader may recall that even 
before 1932 Bender began utilizing 
Wertheimer's designs as a means of 
tapping such processes as intelligence 
in children and the mentally defec- 
tive, schizophrenic reactions, and the 
organic brain disturbances correlated 
with trauma or toxic agents. Since 
then, she has published two manuals 


1 Now at Veterans Administration Hospital,
New Orleans, Louisiana. 


that have served as standards (Bender,
1938, 1946). Bell (1948), Billingslea
(1948) [see Footnote 2], Buros (1949, 1953), Pascal
and Suttell (1951), Peek and
Quast (1951), and Woltmann (1950) 
published manuals describing their 
own scoring and interpretation system 
or giving textbook accounts of the 
method. In them will be found an 
adequate bibliography of the field 
through 1950. Hutt (1953) repub- 
lished his 1945 restricted Army 
manual with some modifications, but 
still omitted supporting research 
data. Gobetz (1953) was the last to 
publish a manual-type monograph. 
In 1952 the attitudes towards this 
tool were so uncrystallized, that 
Burton, writing in Buros (1953) re- 
ported: 

Watching an expert, trained at one of the 
military installations interpreting a Bender 
was to witness the crudest sort of crystal gaz- 
ing. It was particularly dismaying to have to 
come to terms with the fact that this kind of 
operation was taking place in many univer- 
sity clinical facilities, those presumed citadels 
of the new dynamo-scientific clinical psychology.
Reaction on the part of competent
[see Footnote 4] clinical psychologists was
swift and also somewhat overgeneralized. The
scorn, which was justifiably poured on the
crystal gazing, extended to the instrument
itself (p. 287).

2 In my monograph a Mason General
Hospital Bender Gestalt guide is listed as
having an "anonymous" author. By personal
communication Edith E. Lord reported having
authored it.

3 Authors recently entering the field frequently
use the Army reference for Hutt.
It was "restricted" material, but his 1953 publication
is readily available.


Many studies have been published 
yearly on specialized aspects of this 
test. It seems, therefore, that an up- 
to-date survey of the literature 
should permit a re-evaluation of this 
clinical procedure and its original 
assumptions, and thus help set the 
foundations for its use today. 
Though this report attempts to be 
inclusive, it cannot be complete. 
Some literature was not readily 
available, some was hidden under un- 
revealing titles, and some may have 
been overlooked unintentionally. 


FIGURES


No outstanding changes in the 
original designs have been made that 
have been accepted. Barkley (1949) 
developed a set of raised plastic plates 
of the designs, which he had prophe- 
sied would assist in discrimination of 
intracranial organic pathology. No 
one else has published results from 
using his plates. In 1948, I urged 
acceptance of one standard set of 
figures, yet Popplestone (1956) was 
forced to lament the many variations 
apparent in the seven published ver- 
sions of the BG figures he surveyed. 
Subjective methods of evaluation 
(Bender, 1938; Hutt, 1953) and even 
the relatively objective scoring sys- 
tems (Koppitz, 1958a; Pascal & Sut- 
tell, 1951) utilize the stimulus design 
as the standard referent against 
which to judge the experimental sub- 
ject’s or the patient’s reproduction 
protocol. Willingness to disregard
such variations in stimulus designs
assumes, at best, that as long as the
essential gestalt of the individual
designs is maintained from set to set,
the requirements of all significant
perceptual principles have been met.
I continue to believe such an assumption
is untenable. If that opinion is
sound, it follows that psychologists,
functioning as clinicians, warrant
some of the embarrassing criticism
received from experimental colleagues.
It seems feasible to establish
procedures permitting the adoption
of a standard set of stimuli with
minimal difficulty.

4 Burton seems to be implying, even then,
acceptance of the “standardized” clinician
concept advocated later by Hunt (1959) and
Meehl (1954, 1956, 1957).


SCORING SYSTEMS: CHILDREN AND 
EARLY ADOLESCENTS 


Bender (1938, 1946) incorporated a 
table of responses for each year of age 
from ages 3 through 11 to “adults.” 
The examiner compares his subjects' 
protocols with these responses, and 
thus obtains a suggested mental age 
(MA) level. Applying her approach 
to 100 hospitalized youths aged 3-20, 
two Frenchmen, Hewyer and Angoul- 
vent (1949), found their obtained 
BG MAs equal to or better than 
those obtained from Binet-Simon or
Stanford-Terman batteries. Then 
Wolfsohn (1951-52), faced with a 
flood of children immigrating into 
Israel, combined the BG with Patter- 
son Form Boards and Goodenough 
scoring of Draw-A-Man Test. With 
children 6-11 years he found a corre- 
lation of .73 with the Goodenough 
MAs and he expressed the belief that 
the BG Test was apparently culture 
free. 

Modifications in Bender’s proce- 
dure were first reported by Keller 
(1955), who employed “a priori judgment”
methods to develop a scoring
system involving maturation levels 
for use with mentally handicapped 
children. Though no full account of
the system seems to have been published,
an observer of the procedure
reported it to be “simple and fast.” 
Apparently 114 signs selected by 
Keller are employed on all nine fig- 
ures and then the sum of the “passes” 
on these signs is taken as a "total 
score." The signs related to each fig- 
ure were roughly assembled in order 
of increasing difficulty. He reports a 
12-month retest reliability of .89 on 
36 boys. Correlations with the Binet 
and the Grace Arthur ranged from .63 
to .77 on 37 boys, age range 7-11. On 
a 13-year-old group of boys with a 
higher average IQ, similar correla- 
tions ranged from .43 to .67. It is 
interesting that Keller found the 
correlations between his BG total 
scores and the Reading and Arith- 
metic achievement test scores to be 
higher than the correlations between 
the Grace Arthur and these tests, but 
lower than the correlations between 
the Binet and these same tests. The 
Pascal and Suttell (P&S) scoring 
system was applied by Baroff (1957) 
to the protocols of 84 mentally de- 
ficient twins. He felt his results dis- 
criminated successfully seven differ- 
ent MA levels and also validated 
certain signs suggestive of organic 
brain damage. He believed that 
norms in the form of scores plus 
qualitative signs could be established 
for his seven MA ranges, which 
would discriminate the endogenous 
mentally deficient individual. Only 4 
of 15 signs selected by Byrd (1956) to 
score the BG protocols of 200 mal- 
adjusted youths 8-16 years old and 
those of a group of “normals” tended 
to discriminate the two groups. The 
signs were: orderly sequence, change 
in curvature, closure difficulties, and 
rotation. The abstract of Wewetzer's 
(1956) paper reports that the partial
scores of his scoring method for children's
BG protocols will "satisfactorily
differentiate" brain damaged
children from normals. Koppitz
(1958a, 1960a) in 1958 hypothesized 
that if the BG Test discriminated a 
group of children with low achieve- 
ments in reading, writing, and spell- 
ing from a similar group with above 
average achievements in those skills, 
the test would be discovering learning 
disturbances primarily due to prob- 
lems in visual-motor perception. She 
modified P&S's scoring system to 
seven main factors, which were re- 
lated to distortions in Figures A, 3, 5, 
and 7. The sum of the indices for the 
seven factors gave a total score. She 
applied her procedure to an initial 
group of 41 above average and 36 
below average achievers from the 
first to fourth grades, with an age 
range of 6-10 years. Then Koppitz 
(1958) cross-validated her results on a
group of 31 above average and 20 
below average children of comparable 
age and education. This second 
group had the additional characteristic
of being maladjusted. She reported
a significant discrimination
between the two groups. In 1960 
Koppitz published the results of a 
normative study on her scoring sys- 
tem which she applied to the proto- 
cols of 1,055 kindergarten through 
fourth grade children having a range 
of 5-0 to 10-5 years. In this article she
presented the normative scores in a
table containing 11 6-month groups 
throughout this age range. Likewise, 
Koppitz gave a table of mean scores 
for the five grade school levels. She 
reports the system continues to dis- 
criminate significantly between her 
groups. 
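
The additive "total score" described above for Keller's procedure can be illustrated with a short Python sketch. Because no full account of his system seems to have been published, everything specific here is hypothetical: the function name `total_score`, the figure labels, and the pass/fail entries are placeholders chosen for illustration, not Keller's actual 114 signs.

```python
def total_score(passes_by_figure):
    """Return the number of signs passed, summed over all figures.

    `passes_by_figure` maps a figure label to a list of booleans,
    one per sign, where True means the sign was "passed".
    """
    return sum(
        sum(1 for passed in signs if passed)
        for signs in passes_by_figure.values()
    )


# Hypothetical protocol: two figures with a handful of placeholder signs each.
protocol = {
    "A": [True, True, False],
    "1": [True, False, False, True],
}
print(total_score(protocol))
```

The point of the sketch is only that such a score treats every sign as equally weighted; a system in which the signs carry different weights would need a weighted sum instead.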

The stability of the scoring systems 
used on this 3-12 year age range for 
establishing MAs and giving clues to 
the presence of brain damage con- 
tinues to show promise of rewards 
from further explorations. Most of
the above studies, of course, only
report significant group differences. 
The finding that the Binet was more 
efficient in predicting individual cases 
than was the BG score points up the 
caution that is necessary when one 
interprets the individual subject's 
BG protocol. Perhaps research results
will be forthcoming which will
more clearly define the limits of the 
test when the individual child's 
protocol is interpreted. 


SCORING SYSTEMS: LATE ADOLESCENCE
AND ADULTS


Inspection Systems 


This approach was introduced by 
Bender (1938, 1946). The protocols, 
her minute observations of them, and 
the interpretive manner in which she 
combines the observations to fit 
diagnostic categories, are still stimu- 
lating after many rereadings. Sulli- 
van and Welch (1948) used her sys- 
tem to score the protocols of 101 ex- 
perimental subjects and of a matched 
control group. The experimental 
group had a 6-17 year age range, 
included both sexes, and all had a 
history of poliomyelitis. The matched 
control group had not had "polio." 
The authors divided the two groups 
into two subgroups. One pair of the 
subgroups, experimental and control, 
received the Stanford-Binet, the California
Test of Personality, the Hunt-Minnesota,
and the BG. The other pair did not receive
the BG or Hunt-Minnesota. The authors reported a
significant discrimination by the BG 
of the experimental group from the 
control group. A frequently men- 
tioned Military Clinical Psychology 
Technical Manual (Department of 
the Army, 1951) simply paraphrases 
Bender's procedure and then recommends
that it be used and interpreted
by an experienced clinician.

Hutt (1953) modified Bender’s
inspection procedure, distinctly. He
defines 27 “scoring” factors, de- 
scriptively. For the first three factors 
and the fifth, he gives his range for 
normals, but does not list similar 
suggestions for the rest of the factors. 
One must suppose, therefore, that his 
normals accurately reproduced the 
stimulus figure. One or more psycho- 
dynamic or diagnostic grouping char- 
acteristics are attributed to each 
factor. Hutt frequently refers to 
"our data" and "our findings," but 
these are not disclosed in his publica- 
tion and they do not seem to have 
been published elsewhere. Though 
he rejects the term "syndrome," Hutt
continues to list the groupings of his 
factors which he feels discriminate 
between selected psychiatric classifications.
These groupings he calls
"patterns." The system was used by
Harriman and Harriman (1950) on 
the records of 30 5-year-old nursery 
school children who had not com- 
menced to read, and on 30 second 
grade 7-year-old children who were 
judged to be making satisfactory 
progress in reading. They used 11 of 
Hutt's factors and tabulated the 
percentage of each group whose pro- 
tocols appeared to contain positive 
aspects of the 11 factors. No statistical
procedures were employed,
but the authors judged that 4 of the 11
factors showed "sign differences" in 
favor of the second grade group. The 
authors interpreted their results as 
showing that sensory-perceptual- 
motor activities of 7-year-old children 
more closely resemble those of adults 
than do those of nursery school chil- 
dren. They feel this characteristic is 
a necessary correlate of reading readi- 
ness. Their study was repeated by 
Baldwin (1950) on two Negro ado- 
lescent sisters. One sister's BG 
protocol supported Harriman’s find- 
ings. The similar protocol of the other 
gave findings that were just opposite
to Harriman's results. Hanvik (1951)
employed Hutt's inspection proce- 
dure on regular and recall BG proto- 
cols obtained from two groups of 
adult male veterans, one composed of 
patients in a Veterans Administration
neuropsychiatric (NP) hospital for 
treatment of functional complaints. 
Using 19 of the factors, he gained a 
total score by tallying counts where 
any factor appeared in a protocol. 
Secondly, he had experienced judges 
sort the protocols on the basis of ex- 
perimental or nonexperimental group 
membership. Finally, he scored the 
recall protocols for the number of 
total designs remembered. None
of the three methods discriminated
the two groups significantly. The
rapid inspection system was also used 
by Matchabely and Bertrand (1953) 
on the BG protocols of 82 hospital- 
ized NP patients in a French hos- 
pital. The authors reported definite 
BG factor patterns for schizophre- 
nia, paranoia, alcoholism, melancho- 
lia, neurosis, epilepsy, oligophrenia, 
dementia paralytica, hysteria, and
psychopathic personality. 

The “creatively interpretive” clini- 
cian feels fettered by the demands of 
experimental procedural controls. He 
prefers to inspect the results of his 
clinical examinations, searching for 
hunches and apparent trends in these 
data. Many well designed, but sterile, 
pieces of research bear mute testi- 
mony to this point. An increasing 
number of statistical methods that 
objectify the data without destroying 
clinically apparent trends are avail- 
able. Their use should help to allevi- 
ate much of the reticence now ap- 
parent and improve the quality of 
the publications.


Objective Scoring Systems 


Billingslea’s (1948) monograph was 
the first published attempt at an 
objective scoring system. Following
Bender's and Hutt's lead, he selected
and defined 38 factors, using 137 
indices to score them. Measurements 
were made in an objective fashion. 
He used this system on the protocols 
of 100 neurotic adult male patients in 
an Army hospital and 50 adult male 
soldiers judged to fall within the nor- 
mal range of emotional behavior. He 
was unable for the most part to dem- 
onstrate interfigure reliability for the 
test factors. Likewise, Hutt's pat- 
tern for the psychoneurotic record 
was not found to be valid. Though
the monograph seemed to stimulate 
much research activity following its 
publication, the scoring method has 
proven too cumbersome for clinical 
and research use. A scoring system 
employing graph paper to determine 
objectively the size factor of the BG 
figures was reported by Kitay (1950). 
He used this method to obtain data 
for 25 indices, which in turn led to a D 
score. He expected D to designate an 
overall expansion or contraction of 
size and reported it did so when he 
applied it to the protocols of 60 nor- 
mal college undergraduates. 

A compromise was adopted by 
Pascal and Suttell (1951). They 
settled on 105 factors and provided 
actual BG records to assist the 
scorer in determining whether the 
factor is present in a particular protocol.
Each factor is given a numerical
value. The sum of the numerical
values of the factors judged to be 
present in a protocol represents a 
“total raw score.” A helpful scoring 
blank is provided. The authors used 
their scoring system on 271 protocols 
from adults with a high school back- 
ground and 203 protocols from adults 
with college background. The raw 
scores of each of those distributions 
translated into standard scores per- 
mitted the establishment of Z or 
weighted scores. Applying their 
scoring system to the protocols of a
group of psychotic patients and a
group of neurotic patients, they 
found significant differences between 
the groups with psychopathology as 
well as between these groups and the 
nonpatient normals. They indicated 
the results were not adequate, how- 
ever, for interpreting an individual 
protocol. The ability of this system 
to discriminate groups of patients 
displaying different types of psycho- 
pathology from nonpatient groups 
has been supported by the findings of 
Addington (1952), Swenson and 
Pascal (1953), Curnutt (1953), Robin- 
son (1953), and Lonstein (1954). 
Another series of studies, on the other 
hand, are less favorable. Curnutt 
and Lewis (1954) obtained BG pro- 
tocols and Rorschachs from 25 hos- 
pitalized NP patients. They hy- 
pothesized that the BG Z score and 
the F+ percentage of the Rorschach
were theoretically similar indices, and 
therefore, should show a relatively 
close relationship. Their data dis- 
closed no correlation between the two 
distributions. Likewise, Blum and 
Nims (1953) found that the P&S 
system failed to differentiate signifi- 
cantly the BG protocols of a group of 
NP patients from a control group 
which was instructed to simulate 
neuropsychiatric illness. The au- 
thors also applied a clinical matching 
procedure to the same protocols, 
which significantly did differentiate 
the experimental from the control 
group. Incidentally, the authors 
found no significant difference be- 
tween results gained from administer- 
ing the BG by group procedures or 
by individual testing. Negative re- 
sults were reported by Tamkin (1957) 
in an attempt to discriminate psy- 
chotic and nonpsychotic male veteran 
NP patients, and by Tucker and 
Spielberg (1958), who sought to 
separate “depressed” patients from
those with other NP disturbances.
Rosenthal and Imber (1955) de- 
scribed a somewhat different method 
of evaluating the P&S scoring system. 
The subjects were 13 NP outpatient
cases with diagnoses ranging from some
type of neurosis to some type of
schizophrenia in a significant degree of
remission. These subjects were tested
in a before, during, and after type of 
experimental design. The design 
covered five periods of 2 weeks each. 
Psychiatrists saw the subjects once 
each period and rated their observa- 
tions on a check list. The psycholo- 
gist administered the BG Test once 
each period. Between these periods, 
Mephenesin or a placebo was given
in a prescribed dosage to the subjects 
by the psychiatrist in a random, blind 
manner. Statistical interpretation of 
the data revealed that the check list 
did not disclose any clinical improve- 
ment in the patients. Likewise, there 
was no correlation between P&S 
scores on the BG protocols and times 
when Mephenesin was administered
versus times when placebos were 
administered. Of added importance 
in this study was the impression of 
these authors that some changes they 
noted in the BG protocol could be 
attributed to practice effects on the 
test. Keehn (1957) examined the 
question of repeated testing by com- 
paring series of BG protocols ob- 
tained from four chronic schizo- 
phrenic patients. He tested his sub- 
jects from 13 to 15 times on the BG 
with 4-day intervals between testing. 
The author found no systematic 
trends in the P&S scores for all of the 
patients. This was true of the total
scores, as well as the scores on the 
individual figures. Two of the subjects’
protocols showed “improvement”
on all figures during the testing
period, whereas two others tended to 
regress somewhat. The scores for all
subjects fluctuated markedly from
individual testing to individual test- 
ing. The author also administered 
the Kohs block designs from the
Wechsler-Bellevue Intelligence Scale, 
Form II (W-B II) battery, to these 
four subjects at each administration 
of the BG Test. Although variations 
also appeared in the subjects' scores 
on that measure, the final results 
displayed definite improvement 
through improved accuracy. It 
seems obvious that though this scor- 
ing system has many limitations, its 
assets probably outweigh the limita- 
tions. Further support for this con- 
clusion is found in the fact that the 
literature reveals it to be the most 
widely used scoring system on the 
BG Test today. 
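
The arithmetic of the P&S procedure described above, a weighted raw score followed by conversion to a standard score against a normative distribution, can be sketched as follows. This is a minimal Python illustration under stated assumptions: the factor names and weights are hypothetical placeholders, not values from the P&S manual, and the scaling of the standard score to a mean of 50 and a standard deviation of 10 is an assumption of this sketch, since the text says only that raw scores were translated into "Z or weighted scores."

```python
from statistics import mean, stdev


def raw_score(present_factors, weights):
    """Sum the numerical values of the factors judged present in a protocol."""
    return sum(weights[factor] for factor in present_factors)


def standard_score(raw, norm_raw_scores, center=50.0, spread=10.0):
    """Express a raw score relative to a normative sample of raw scores.

    `center` and `spread` are assumed scaling constants for this sketch.
    """
    norm_mean = mean(norm_raw_scores)
    norm_sd = stdev(norm_raw_scores)  # sample standard deviation
    return center + spread * (raw - norm_mean) / norm_sd


# Hypothetical weights and a hypothetical normative sample.
weights = {"rotation": 8, "dashes": 2, "workover": 2}
norms = [6, 8, 10, 12, 14]

raw = raw_score(["rotation", "dashes"], weights)
print(raw, standard_score(raw, norms))
```

A separate normative sample would be supplied for each educational background, as in the two distributions the authors report; the sketch makes plain that the resulting score is only as meaningful as the normative sample behind it.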

Peek and Quast (1951) published 
an objective scoring system which 
was not available to this reviewer, nor 
could a description of it be found 
elsewhere in the literature. Peek 
(1953) does report a study of the 
relationships between one BG Test 
factor and personality character- 
istics. It is not clear whether this 
factor was included in the author's 
original scoring system. The factor 
relates to Figure 5 and depends on 
whether the subject in drawing the 
dotted directional line of that figure 
starts from the rim of the cup of the 
figure or starts from the outer edge of 
the line and approaches the cup accordingly.
Seemingly, the “combination”
of personality characteristics
of 75 hospitalized male NP patients 
who started the diagonal at its upper 
edge is significantly different from the 
personality characteristics of a similar
group whose directionality of line 
pattern was unknown. Goodstein, 
Spielberger, Williams, and Dahlstrom 
(1955) were concerned whether the 
method of “free recall” incorporated
in the Peek and Quast scoring system
might be affected unduly by the
serial position of the figures and their 
varying levels of difficulty of recall. 
Using college students and an ade- 
quate experimental design, they con- 
cluded that BG Figures A, 1, and 2 
are the easiest to recall, while Designs 
3, 4, and 7 are most difficult. They 
also found that recall adequacy is a 
function of the serial position of the 
designs, as well as their difficulty 
level. Stewart (1957), noting their
findings on Figures 3 and 4, wondered
if, since these figures were located 
essentially in the middle of the se- 
quence, the findings might also be a 
function of the learning principle that 
"ordinarily items toward the begin- 
ning and end of a series are easier to 
learn than those in the center." His 
preliminary results appeared to sup- 
port that learning principle, though 
they do not, necessarily, negate the 
previous study's results. 

Japanese psychotic adults have 
been discriminated by Okino's (1956) 
121-index scoring system, but
Uffelmann (1958) failed to differenti- 
ate Canadian schizophrenic adults 
from hospital employees with a sys- 
tem he developed. 

Comparisons of the effectiveness of 
the inspection system versus the P&S 
scoring method have been made. 
Bowland and Deabler (1956) found 
both approaches successfully sepa- 
rated the BG protocols from four 
similar adult male groups judged to 
be nonneuropsychiatric patients, neu- 
rotic patients, schizophrenic patients, 
and patients with organic brain 
damage. Mehlman and Vatovec 
(1956) submitted 25 protocols of 
carefully matched patients to three 
nationally known BG “experts” who
employed their particular evaluatory 
approaches. One patient of each pair
was judged to have a “functional” 
psychosis while the other had an
“organic” psychosis. Only two of the
judges approached better than chance 
separation of the groups. The au- 
thors report difficulties in obtaining 
the participation of five other ex- 
perts. I have no comment. Later 
Nadler, Fink, Shontz, and Brink 
(1959) described a well designed 
study aimed at experimentally com- 
paring the efficiency of P&S's scoring 
system on BG protocols with that of 
clinical inspections. They set the task 
of discriminating the protocols of 27 
patients with known organic brain 
damage from those of 26 patients 
judged to be without such damage. 
Both methods were reported to be 
equally successful. They then re- 
moved the 28 cases from the extremes 
of their distributions on organicity 
and found that neither method could 
discriminate the two groups signifi- 
cantly. The judges, by the way, were 
two psychologists familiar with the 
P&S system, two psychologists who 
were not familiar with that system, 
and two occupational therapists not 
trained in psychology. The reliability 
between the judgments of members of 
pairs of judges was high and the reli- 
ability between judges generally was 
almost as high. 

McPherson and Pepin (1955) asked
whether the scores obtained on the 
same subject from his BG protocols 
from two different motor methods of 
reproducing the figures would be the 
same. They argued that the scores 
from the two methods must be essen- 
tially the same if they were to be 
interpreted diagnostically. Their two 
methods involved: the usual admin- 
istration procedure of the BG, and 
having the subject construct the 
figures by placing pieces of felt on a 
felt board. They used 32 male and 
female senior college students as sub- 
jects. The results were rated on the 
degree of similarity between the 
stimulus figure and the reproduced
figure. They found an acceptable
agreement between the two methods
of reproduction 77% of the time,
with a “total or extreme disagreement
occurring only 6% of the time.”
They concluded that BG figure 
reproductions were more influenced 
by the perceptual factors than by 
motor factors. 

The preceding survey seems to 
force the conclusion that both 
Bender's inspection system and P&S's 
objective scoring system have stood 
the test of time, when the problem is 
to separate grossly the BG protocols 
reflecting major disturbances from 
those reflecting normal behavior. 
Neither approach adequately handles 
diagnostic evaluation of the individ- 
ual case. The quality of the “stand- 
ardization” of the clinician appears 
to be the important variable under 
these conditions. The “categorizing” 
approach was losing popularity in 
1947, and now seems to be dying 
normally. Its place is being taken by 
a return to discriminatory evaluations
of mental processes such as percep- 
tion, problem solving or thinking, 
attitudes, need gratification, etc. 


SPECIALIZED INTERPRETATIVE 
APPROACHES TO THE BENDER 
GESTALT 


Guertin (1952, 1954a, 1954b, 1955) 
set himself the problem of seeking 
“some systematic order which could 
be developed out of the relationships 
among the scoring factors for the BG 
by the application of factor analysis.” 
Initially, he obtained the BG proto- 
cols from 100 hospitalized NP pa- 
tients, psychiatrically classified as 
either psychotic or with organic 
brain pathology. His scoring proce- 
dure essentially followed Billingslea’s 
definitions and included indices for 41 
variables. A factor analysis of the 
intercorrelation matrix on these variables
yielded five factors which he
labeled: (a) propensity for curvilinear 
movement, (b) pure spatial contigu- 
ity, (c) constriction, (d) careless 
execution, and (e) poor reality con- 
tact. Next Guertin interested himself 
in the characteristics of the factor 
“propensity for curvilinear move- 
ment” discovered in his first study. 
He noted that it was most heavily 
loaded in catatonic and mixed schizo- 
phrenic subjects. From that he hy- 
pothesized that poor emotional con- 
trol might be basic to “certain types 
of distortion” of the designs, and in 
turn, might be an important variable 
in schizophrenia. Again he obtained 
the BG protocols from 100 hospital- 
ized NP patients, 26 of whom were 
females. Diagnostically the group 
included 13 nonschizophrenics, 15 
catatonic schizophrenics, 14 hebe- 
phrenic schizophrenics, and 35 mixed 
schizophrenics. Five factors were 
revealed by an analysis of 42 signs: 
(a) unstable closure, (b) curvilinear 
distortion, (c) propensity for curvi- 
linear movement Factor II, (d) frag- 
mentation, and (e) irregular propen- 
sity for curvilinear movement. Fac- 
tor a seemed related to an underlying 
general instability. Factor b appeared 
related to impulsiveness with a possible
emotional basis. Factor c was
related to emotional disorganization 
and display of affect. Factor d seemed 
related to either misperception or 
attempts to avoid unpleasant feelings 
through use of associations. Factor e 
was related to emotional conflicts and 
neurotic-type defenses against un- 
pleasant feelings. Still seeking for a 
BG diagnostic formulation of schizo- 
phrenia in general, he next performed 
a transposed factor analysis on the 
BG protocols of 32 hospitalized 
male, chronic, long-term schizo- 
phrenic patients with a variety of 
subtype diagnoses. No large group
factor corresponding to schizophrenia 
emerged. On the other hand, the 
study did tend to group his subjects 
into four subtypes that he labeled: 
chronic undifferentiated schizophrenia,
disorganized schizophrenia,
conforming and nondefensive schizophrenia,
and actively defensive schizophrenia.
Lastly, Guertin (1955) did a
transposed factor analysis of the BG 
protocols of 30 male hospitalized NP 
subjects with recent onsets of a 
paranoid type schizophrenic reaction. 
This study also disclosed four sub- 
types for this diagnostic grouping 
which he labeled: chronic and deteri- 
orated, hostile reactive, poorly inte- 
grated, and inadequate and with- 
drawn. The writer does not know the 
degree of impact Guertin’s studies 
have had on the general use of the 
BG. His work has been given an ex- 
tended review here because it reflects 
a statistically synthesized organiza- 
tion of the many BG indices and 
factors used to discriminate the pro- 
tocols of patients with schizophrenic 
reactions. 

A truly amazing number and vari- 
ety of symbolic interpretations have 
been produced by clinical psycholo- 
gists for BG protocols and made 
available to some of their colleagues 
on a private basis. Only three studies⁵
have been concerned with this prob- 
lem, however. Suczek and Klopfer 
(1952) sought to introduce a degree of 
realism in this activity by obtaining 
the free associations to each of the 
nine BG figures of 48 beginning 
psychology college students who were 


5 Three additional studies (Guertin &
Davis, 1962, personal communication; Hammer,
1954; Tolor, 1960) have been done.
Hammer's small but positive results were
contaminated by questionable testing procedure,
and the other two studies produced
negative findings.

viewing the projected designs in a
group setting. Their free associations 
to each of the figures could be 
grouped under five headings: a list of 
objects frequently associated with 
each, the particular part of the figure 
upon which average interest focused, 
the affective pull which the figure had 
upon the average viewer, the average 
symbolic meaning of the figure for the 
standard group, and the experi- 
menters' tentative interpretation of 
the figure as a symbol. Although the 
reader may not agree with those ten- 
tative interpretations, he at least has 
objective data to support the other 
four groupings. Tolor (1957), following
Suczek and Klopfer's lead, had
50 Air Force NP patients give their 
associations to the nine BG designs. 
He found the individual designs 
differed decidedly in their stimulus 
value for the subjects. There were 
many rejections, many vague associa- 
tions, and many simple descriptions. 
The intellectual and educational back- 
grounds of the subjects were not de- 
scribed, however. He urged caution 
when interpreting a BG protocol 
symbolically. Each of the child subjects of
Greenbaum (1955) was asked to state
what each BG design reminded him
of after he had reproduced all figures.
Some of the nouns from each sub- 
ject's statements then were selected 
and mixed in a commonly used word 
association list. This modified list was 
administered to the subject on two 
different follow-up occasions. The 
subject's associations to the two word 
lists were subjectively interpreted 
and so cannot be evaluated by this 
reviewer. It is probable that sym- 
bolic interpretations of the BG 
protocol will continue to be popular. 
Little support for their a priori validation
is given by the preceding research,
however. Empirical hunches
in this area should be put to scientific
tests or their dissemination should be
prevented.

The clinician dealing with patients 
with varied cultural backgrounds 
must be alert for test stimuli that are 
influenced in a minimum manner by 
differing cultural milieus. Peixotto 
(1954) was faced with this problem 
in a Hawaiian clinic and wondered if 
the BG would prove to be a culture-free
instrument. She selected 35 NP
patients active in the outpatient 
clinic who represented seven different 
ethnic groups on the islands. Age 
range was 14-31 and their IQ range 
was 82-135. The BG protocols were 
scored according to the P&S method 
and an analysis of variance of the raw 
scores was made with the expectation
of finding no important differences 
between ethnic groups. Variation 
between ethnic groups was significant 
at the 5% level, however. Peixotto 
concluded that the instrument reflected
characteristics common to any
specific ethnic group but varied between
groups, and thus was not culture
free. One wishes for a cross-validation
on a much larger sample,
for the N in each ethnic group was
small in this study. Wolfsohn (1951-52),
as reported previously in this review,
believed the BG Test to be
adequately culture free when it was 
applied as an intelligence test to 
Jewish immigrant children. 


BENDER GESTALT DISCERNMENT 
OF ORGANIC BRAIN PATHOLOGY 


Rotation of the BG designs seems 
to be widely accepted by clinicians 
as a discriminating factor in organic 
brain pathology. Yet, it occurs in the 
protocols obtained from subjects 
carrying other diagnostic labels who 
are considered clinically to be free of 
any brain pathology. One must keep 
in mind that in the literature re- 
viewed in this article, the brain 
pathology generally ascribed to the
experimental subjects was of the
lesion and/or blood circulatory types, 
rather than the toxic. The possibility 
of the presence of a metabolic toxic 
brain disturbance in functional psy- 
chosis certainly has not been dis- 
proven at this time. Thus, it is pos- 
sible that Bender figure rotations 
reflect brain chemical pathology too. 
Many studies already mentioned 
have included rotation as one of 
several factors investigated. There 
are several studies that seek to evalu- 
ate this one factor alone, however. 
Hanvik and Anderson (1950) com- 
pared a group of 44 brain damaged, 
hospitalized patients with a control 
group. They found 59% of the ex- 
perimental group produced one or 
more rotations of 30 degrees or more 
in the nine designs, while only 19% of 
the control group did so. Later, 
Hanvik (1953) examined the EEG 
reports of 20 of his child patients 
whose BG protocols contained rotations
of 30 degrees or greater. Eighty percent of
their EEGs were abnormal. Chorost,
Spivack, and Levine (1959) attempted
to validate Hanvik’s (1953) findings.
Their subjects were 68 “children”
below the age of 18. Of the 68 protocols,
51 had one or more design rotations.
The EEGs of 69% of the 51
subjects were abnormal. However,
47% of the EEGs of the remaining 17
subjects who did not rotate any BG
designs were also abnormal. The authors
concluded that in their sample
the “success of the BG rotation test 
was not much better than chance 
probability.” The BG protocols of
1,003 male NP veteran hospital patients
were used by Griffith and
Taylor (1960). They defined rotation
as a movement of the design of 45
degrees or more with the figure still
recognizable. Fifty-six percent of the
mentally deficient patients in their
group gave protocols with such rotations.
Forty percent of the chronic
brain syndrome cases in their group 
gave protocols with rotations. These 
percentages were significantly higher 
than those found in the schizophrenic 
reaction group, neurotic group, and 
the character disorder groups. The 
diagnostic significance of rotations of 
BG designs is obviously not settled. 
Other pattern evoking stimuli are 
being studied for this characteristic. 
Perhaps these efforts should be con- 
solidated by administering the BG 
to the subjects being studied with 
other rotation evoking material and 
comparing results. We can hope also 
for more valid criteria of organic 
brain pathology than EEGs alone. 
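
As a rough check on the Chorost, Spivack, and Levine (1959) figures cited above, the reported percentages can be turned back into approximate cell counts and tested for association. The sketch below is not the authors' own analysis. The counts (35 of 51 rotators and 8 of 17 nonrotators with abnormal EEGs) are rounded reconstructions from the reported 69% and 47%, and the function names are illustrative only.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


def p_value_df1(chi2):
    """Upper-tail probability of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


# ASSUMPTION: cell counts are reconstructed by rounding the reported percentages.
# 69% of 51 rotators -> about 35 abnormal EEGs; 47% of 17 nonrotators -> about 8.
rotators_abnormal, rotators_normal = 35, 16
nonrotators_abnormal, nonrotators_normal = 8, 9

chi2 = chi_square_2x2(rotators_abnormal, rotators_normal,
                      nonrotators_abnormal, nonrotators_normal)
p = p_value_df1(chi2)
print(f"chi-square = {chi2:.2f}, p = {p:.3f}")
```

With these reconstructed counts the statistic comes to about 2.55, short of the 3.84 required at the .05 level with one degree of freedom, which agrees with the authors' conclusion that rotation did little better than chance in their sample.
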
A recall protocol has not been 
routine procedure in BG administra- 
tion to my knowledge. Apparently, 
Peek and Quast (1951) included it on 
the basis that it might “afford a 
measure of the learning taking place 
during the copying phase of the test, 
and (that it) fits the paradigm of one- 
trial serial learning under free recall 
conditions, with the important excep- 
tion that the S is not instructed to 
learn” (Goodstein et al., 1955). Since 
Peek and Quast published their 
manual while at the University of 
Minnesota, it is probable that they 
were influenced by Sullivan and 
Welch's (1948) work. The reader 
should have in mind the previously 
reviewed studies on the effects on the 
recall score of the serial position of the 
designs and their difficulty level 
when he considers the following reports.
Hanvik and Anderson (1950),
following Welch's lead, in addition to 
a rotation score, obtained a recall 
score by counting the number of 
designs correctly recalled following 
the usual reproduction procedure. 
The mean recall score did not dis- 
criminate significantly their brain 
damaged group of patients from their
normal group. Three recall scores
were obtained by Niebuhr and Cohen
(1956) for each of their subjects dur- 
ing one test period. Prior to the regu- 
lar test, they used a 10-second expo- 
sure of the design card followed by 
immediate recall of each design. 
After the regular procedure was com- 
pleted, a recognition memory score 
was obtained by asking the subject to 
select the correct design from a choice 
of six alternates constructed for each 
of the nine figures. Finally, after 
completing this procedure, they asked 
the subject to match one of the six 
alternates with the standard design. 
This permitted a second recognition 
score. They used 10 subjects in each 
of the four groups classified as normal 
nurses, acute schizophrenics, chronic 
schizophrenics, and neurological 
cases. They found a progressively 
increasing degree of perceptual mem- 
ory inefficiency in the order of diag- 
nostic groups listed. There was no 
doubt that the memory scores fully 
discriminated the neurological group, 
but they did not separate adequately 
the acute from the chronic schizo- 
phrenic reaction group. Tolor (1956) 
determined the number of designs 
correctly recalled after the usual 
reproduction procedure, and a score 
for digit span on 91 organics, 35 
convulsives, and 49 psychogenic pa- 
tients. The subjects ranged in age 
from 12 to 72 years and in IQ from 57 
to 136 on the Wechsler-Bellevue 
Intelligence Scale, Form I (W-B I).
The recall scores discriminated the 
organic group more adequately from 
the psychogenic group than did the 
digit span. He cautioned against 
using this finding in interpreting indi- 
vidual cases. Later Tolor (1958) 
cross-validated this study and con- 
trolled the age variable better by 
group matching his character dis- 
order, schizophrenic, and organic 


subjects. This time the organic group 
made significantly poorer recall scores 
on both the BG Test and the Digit 
Span test, except for digits backwards. 
His other two groups were not discriminated
by the immediate memory
tests. Reznikoff and Olin (1957)
sought to validate Tolor's first study. 
They selected their "organics" from 
Tolor's original population. Their 
recall score was the number of whole 
designs accurately recalled. They 
found the score discriminated their 
organic subjects from their schizo- 
phrenic subjects at the .05 level of 
confidence. These authors (Olin & 
Reznikoff, 1958) then developed a 
somewhat different recall score by 
modifying P&S’s scoring system. This 
score significantly discriminated their 
organic brain damage adult patient 
group and their schizophrenic adult 
patient group without brain damage 
from “normal” nurses. The score did 
not, however, discriminate the two 
patient groups. Stewart and Cun- 
ningham (1958) published a study of 
incidental interest here. They ob- 
tained a modified P&S recall score 
on 18 psychotic patients, 17 non- 
psychotic but NP patients, and 
20 nurse subjects. All subjects were 
females and ranged in age from 15 to 
59. The group means discriminated 
their groups significantly. Obviously, 
the results are not definite, but do 
direct attention to the greater effi- 
ciency of certain types of recall scores 
over others. Niebuhr and Cohen's 
(1956) procedure needs cross-validat- 
ing. Likewise, more realistic controls 
of intelligence and other variables 
need to be introduced into future 
investigations of Welch's recall
method. These further research 
efforts are urged with the expectation 
of strengthening the instrument's 
overall validity in the detection of the 
presence of organic brain pathology. 

Several studies have dealt with the 
question of discerning the presence of
organic brain pathology through the 
medium of the BG protocol by em- 
ploying individualized approaches. 
Guertin (1954) was concerned over 
the tendency of psychologists and 
psychiatrists to ascribe certain test 
signs and symptoms to organic patients
without regard to the "varied
etiology and foci of pathology" in the
brain. He obtained BG protocols 
from 27 male adult schizophrenic 
subjects with readily recognizable 
organic brain pathology. A trans- 
posed factor analysis of the correla- 
tion matrix did not discriminate ade- 
quately factors that separated the 
patients' "organicity" from their
psychosis, and so he urged extreme
caution in utilizing organic test signs
without reference to etiology. For
example, can the BG Test be used to
detect the psychiatric disturbances 
aroused in normal subjects by LSD- 
25? Abramson, Waxenburg, Levine, 
Kaufman, and Kornetsky’s (1955) 
results on 25 subjects tended to indi- 
cate they were psychiatrically dis- 
turbed but did not distinguish the 
etiology of the disturbance. On the 
other hand, Hirschenfang (1960) obtained
BG protocols from 25 right
and 25 left hemiplegic hospitalized
patients. The age range for the right 
group was 22-78 years. The age 
range for the left group was 46-83 
years. Sex and intelligence were not 
controlled. He scored the protocols
by the P&S system and found that the
resulting group scores significantly
discriminated the left group as being
"worse."

The efficacy of the BG Test versus 
flicker-fusion as devices for demon- 
strating brain pathology was studied 
by McGuire (1960). He tested two 
groups of 21 Navy men, matched for 
age, of whom the experimental sub- 
jects had known organic brain dam- 
age. He sent their BG protocols to 
five judges, three of whom had had
long experience with the BG Test. 
His flicker-fusion score correctly 
labeled 30 of the 42 subjects. Three 
of the judges correctly labeled 29 of 
the 42 subjects. 
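
To place these hit rates in perspective, one can ask how often blind guessing would do as well. The sketch below assumes (the study is not reported here as stating this) that a judge guessing at random between two equal groups of 21 would be correct half the time and that the 42 classifications are independent. The function name is hypothetical.

```python
from math import comb


def binomial_upper_tail(successes, trials, p=0.5):
    """Probability of at least `successes` correct calls in `trials` independent guesses."""
    return sum(
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# ASSUMPTION: chance accuracy is .50 because the two groups are of equal size (21 each).
p_flicker = binomial_upper_tail(30, 42)  # flicker-fusion: 30 of 42 correct
p_judges = binomial_upper_tail(29, 42)   # judges: 29 of 42 correct

print(f"P(at least 30 of 42 by guessing) = {p_flicker:.4f}")
print(f"P(at least 29 of 42 by guessing) = {p_judges:.4f}")
```

Under those assumptions both tail probabilities fall below .05, so each procedure does better than guessing, although an accuracy of roughly 70% still leaves 12 or 13 of the 42 subjects mislabeled.
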


BENDER GESTALT AS A MEASURE 
OF INTELLIGENCE 


Bender's Mental Age Scale for
children and recent studies of it have 
already been discussed. Koppitz’ 
(1958a, 1960a) scoring system for 
children's protocols has been re- 
viewed earlier. She administered the 
WISC to 90 children for whom she 
had BG protocols (Koppitz, 1958b). 
All children were clinic patients with 
school maladjustment due to learning 
or behavior difficulties. Their age 
range was from 6-7 to 11-7 and the
WISC range in IQ was from 73 to 126. 
All of the WISC scores correlated 
significantly with her BG scores ex- 
cept those from the Coding, Informa- 
tion, Comprehension, and Similari- 
ties subtests. Likewise, Koppitz, 
Sullivan, Blyth, and Shelton (1959) 
found that the Koppitz BG scores 
from the protocols of 143 first grade 
students correlated significantly with 
their Metropolitan Achievement Test 
scores. 

Some findings are available on 
“adult” subjects. For instance, Tolor 
(1956) correlated the usual BG re- 
call scores on 175 NP patients with 
an age range of 12-72 years, with 
their W-B I Total IQs; the r=.50. 
Also, Peek and Olsen (1955) obtained 
a BG recall score, number of correct 
whole and one-half figures, on 193 
hospitalized male and female pa- 
tients, aged 14-72 years. They re- 
ported an r of .34 (p=.001) between 
that score and the patient’s Shipley- 
Hartford CQ score. Since there was 
an essentially .00 correlation between 
the recall and Shipley-Hartford raw 
scores, they concluded the BG was 
evaluating intellectual efficiency, but
not intellectual capacity. A similar 
.00 correlation between Shipley-Hart- 
ford "intelligence" and a 1.5 recall 
score was reported earlier by Aaron- 
son, Nelson, and Holt (1953). Aaron- 
son (1957) obtained BG protocols on 
42 male and 46 female epileptics who 
had also been given the Porteus 
Mazes. The recall score was obtained 
by the 1.5 procedure. The initial 
correlation of .46 between the pa- 
tient's recall score and Porteus quo- 
tient reduced to .21 when age was 
partialled out. Lastly, Peek and 
Storms (1958) asked three trained 
judges to estimate the intellectual 
level of BG protocols obtained from 
100 NP patients without known or- 
ganic brain damage. The subjects 
were of both sexes and ranged in age 
from 16 to 59 years. The correlation 
between the judges' rankings and
Shipley-Hartford T scores was not
significant. The preceding studies
seem to substantiate further the
earlier findings that the BG Test is
useful in estimating the intelligence
of children ranging from 4-0 to 11-11
years, but is inadequate for this purpose
with younger and older individuals.
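
The drop in Aaronson's (1957) correlation from .46 to .21 when age was partialled out can be read against the standard first-order partial correlation formula, given here only as a reminder of what "partialling out" involves. The symbols are labels chosen for this note: x is the BG recall score, y the Porteus quotient, and z age. The two correlations with age are not given in this review, so the .21 cannot be recomputed from the figures above.

```latex
r_{xy \cdot z} \;=\; \frac{r_{xy} - r_{xz}\, r_{yz}}
                        {\sqrt{\left(1 - r_{xz}^{2}\right)\left(1 - r_{yz}^{2}\right)}}
```

The formula makes plain that a sizable zero-order correlation can shrink markedly whenever both measures are themselves related to age.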


DETERMINING PERSONALITY
PSYCHODYNAMICS WITH THE BENDER
GESTALT


The reader is referred to earlier 
material on Hutt's scoring system 
and the studies concerned with free 
associations to the BG designs. No 
direction in the research results is 
apparent in this area, so the next 
studies are presented simply in order 
of publication. Wohl (1957) sought 
for the degree of generalization of 
“constriction” in an individual's BG 
designs, Rorschach responses, Thematic
Apperception Test (TAT) stories,
an interest test, and a semantic
differential device. Constriction was
not revealed to be generalized throughout
these tests by an intercorrelation
procedure. Zolik (1958) obtained 
P&S's Z scores on the BG protocols 
of 43 delinquents and 43 nondelin- 
quents, aged 14-17. The groups were 
matched on age, IQ, and lack of 
"motor defect." A cutoff score of 
Z=60 correctly grouped 69% of the 
delinquents and excluded 95% of the
controls. Koppitz (1960b) took
matched groups of 16 first grade 
students each and obtained BG and 
Draw-A-Person protocols from each 
subject. One group studied under a 
relaxed nontension producing teacher, 
the other under a nervous, tension 
producing teacher. The BG protocols 
scored for tension did not discrimi- 
nate between the two groups, whereas 
the Draw-A-Person scores did.⁶ Like
Wohl’s earlier study, Prado, Peyman,
and Lacey (1960) tested whether the BG
designs could be used to predict 
“flattened affect” psychiatrically 
judged to be present or not present in 
120 NP patients and normals. They 
carefully and objectively scored the 
designs in a manner similar to 
Billingslea's method. The results did 
not discriminate between the groups. 
Gavales and Millon (1960) obtained 
the usual recall scores and a size 
score on the protocols of 80 college
students who had also taken the 
Taylor Manifest Anxiety scale (MA 
scale). They divided their subjects 
into two groups of 40 each, a high 
MA scale score and a low MA scale 
score group. Twenty subjects from
each of the two groups took the BG 
under artificial social stress. Both 
groups of "stressed" subjects tended


6 Clawson's 1959 report is a survey of her
PhD dissertation completed in 1958. In 1962
she released the data in the form of a manual,
available at Western Psychological Services
in Los Angeles.

to produce smaller figures regardless
of their MA scale scores. The same 
size relationship was found in their 
recall scores also. Further, all group 
recall scores showed reduction in size 
from the reproduction design scores. 
On the other hand, Lachmann, Bailey,
and Berrick (1961) found that clinical 
psychologists’ judgments for the 
presence of anxiety in BG and Draw- 
A-Person protocols were not consist- 
ent and their judgments did not agree 
with the MA scale score. It is evident
that attempts to substantiate, experimentally,
this use of the BG continue
to be unrewarding. Most of the
difficulty stems from the fact that the 
characteristic under consideration 
fails to appear consistently through- 
out the whole protocol. The factor 
size remains the one possible excep- 
tion to this generalization. It would 
appear that the individual designs 
are reacted to by subjects as if each 
is a symbolically discrete design. If 
this conclusion is correct, then each 
design must be studied individually 
for what it has to offer psychodynamically,
and the sequential influence
of one design on the others must
be taken into account. Finally, the
perceptual importance of other expressive
behavior patterns that occur
during the drawing period but seldom
have been included in the published
research needs careful investigation.


BENDER GESTALT AS A STANDARD 
FOR JUDGING THE EFFECTS OF 
TREATMENT PROCEDURES 


When employing a test as a stand- 
ard in a study just once, the experi- 
menter is forced to use norms ob- 
tained from other subjects. When 
the instrument is used two or more 
times in a design, the resulting data
can furnish their own controls or 
"norms." Byrd (1956) was interested 
in selecting children in need of psychotherapy
by means of the BG
Test. Using 200 maladjusted sub- 
jects 8-16 years old and a normal 
control group, he found that only 
orderly sequence, change in curva- 
ture, closure difficulty, and rotation 
of his 15-factor indices tended to 
discriminate the groups. Ligthart, 
Johnston, and Sussman (1956) ad- 
ministered the BG and W-B I to a 
group of adult NP patients before 
and after they were given an electro- 
shock therapy (EST) with Coramine 
treatment series. They assumed 
P&S's Z scores would be indicative 
of the amount of psychopathology 
present. In an interesting variation, 
Crasilneck and Michael (1957) de- 
sired to know the absolute and rela- 
tive age levels of behavior that would 
occur when they introduced various 
conditions, one of which was hyp- 
notic age regression. They assumed 
the BG protocols would disclose the 
MA levels on which a subject was 
functioning at a given time without 
his knowing how to depict this on the 
test. The subjects, 10 white nurses 
aged 19-22, were given the W-B I 
and the BG first while awake. Next, 
still awake, they were asked to repro- 
duce the designs as they thought they 
would if they were 4 years old. Next, 
while hypnotized to the somnambu- 
listic state, they again reproduced 
the figures as they thought they 
would if they were 4 years old. 
Finally, they regressed to the 4-year- 
age level hypnotically and were asked 
to reproduce the designs. The mean
MAs for the four groups of protocols 
were 11.2, 9.9, 7.8, and 7.3 years, 
respectively. Lothrop (1958) won- 
dered if the BG with P&S scoring 
would discriminate between a group 
of nine male veterans who responded 
successfully to medical treatment for 
their duodenal ulcer from a similar 
group which did not. The IQs for the 
successful group ranged from 71 to
126, while the failure group's ranged 
from 66 to 110. The correlation of age 
with IQ was significant. The group 
BG means did not even overlap. 
Schon and Waxenberg (1958) ob- 
tained pre- and posthospital BG pro- 
tocols from hypophysectomized sub- 
jects and developed P&S's scores for 
them. The scores worsened signifi- 
cantly after surgery when they re- 
sembled those of psychiatric pa- 
tients. Pre- and post-W-B I scores 
were not significantly different. Fin- 
ally, Higbee, Clark, and Henderson 
(1960) found P&S scored BG proto- 
cols did not differentiate the lengths 
of hospital stays of 72 male veterans. 
If this survey accomplishes little else, 
it should demonstrate that the BG as 
yet has not reached a stage of devel- 
opment where it can be employed to 
evaluate the effects of other variables. 


PERSPECTIVE 


Although there are surprisingly few 
generalizations that can be made 
about the BG Test as a result of 
reviewing the published experiences 
with it over the past decade, the fol- 
lowing seem justified in the light of 
the preceding discussion: 

1. The test continues to be popular 
with clinicians and deserves to remain 
as an additional tool in their repertoire.

2. It is in great need of a universally
accepted standard set of designs.

3. The P&S scoring system has 
proven useful on adult protocols as 
has Koppitz’ modification of it on 
children’s protocols. 

4. Reasonably valid MAs can be 
obtained with it for children 4-12 
years and adults with equivalent 
MAs, but not adolescents and adults 
with higher MAs. 

5. It can be employed as an additional
tool in a battery of tests administered
to an individual when
clues for the possible presence of organic
brain pathology are sought.

6. Whether evaluated with objec- 
tive scores or with some systematic 
inspection procedure, the results tend
to discriminate the psychotic from 
the nonpsychotic and nonpsychiatric 
subject provided their MAs are 13 or 
above. It does not detect effectively 
nonpsychotic emotionally disturbed 
children, however. 

7. When the protocols are inter- 
preted symbolically, the clinician 
must rely almost completely on the 
validity of his own subjective profes- 
sional knowledge. 

8. The test has not been standard- 
ized sufficiently to permit its use as a 
norm against which to judge other 
variables. 

9. More research is needed on the 
perceptual contributions of each de- 
sign and the effects on such percep- 
tions of their sequential appearance 
in the protocol. 


CONGENITAL INSENSITIVITY TO PAIN:
A CRITIQUE¹


RICHARD A. STERNBACH²
Massachusetts General Hospital, Boston 


No available reported case of apparent congenital pain insensitivity 
meets strict requirements for the syndrome. 17 "probable" cases are so
neurologically and behaviorally heterogeneous that there appear to be 
several kinds of insensitivities with variations in the nature and/or 
locus of their neural deficits. The possible kinds of such deficits are dis- 
cussed. The ability of these persons to survive is seriously impaired and 
depends on their ability to use other sensory cues of tissue damage. 
Normal personality development is rarely affected by the absence of
pain.


Persons who, in the absence of 
other abnormality, consistently fail 
to report noxious stimuli as "painful,"
and show no aversive behavior
to such stimuli, are of major interest 
for several reasons: pain is said to be 
an important signal for self-preservative
responses, pain is often considered
the unconditioned stimulus
for conditioned anxiety and thus an 
important factor in personality de- 
velopment, and pain is often thought 
to be a patterning of peripheral 
neural activity along pathways that 
also serve other modalities. Conse- 
quently, when persons who are other- 
wise healthy demonstrate, from birth, 
an insensitivity to pain, one is curious 
to know whether such persons can 
avoid serious injury; whether they 
can be taught to conform to the de- 
mands of the culture; and how the 
sense of pain can be missing without 
impairment of other modalities. The 
importance of these patients is fur- 
ther suggested by the fact that they 
have received special consideration in 


1 F. Nowell Jones first interested me in this 
problem. I am greatly indebted to him and 
to Stanley Cobb, James C. White, Gardner C. 
Quarton, Frank R. Ervin, and Norman 
Geschwind, who have read this paper and 
offered useful suggestions and criticism. 

2 Now at Massachusetts Mental Health 
Center, Boston. 
a recent preliminary theory of pain 
(Barber, 1959) and have been shown 
to be a serious problem to derived- 
drive theories of motivation (McMur- 
ray, 1955). Finally, the mechanisms 
involved in congenital insensitivity 
to pain are of particular interest to 
those concerned with understanding 
the physiological mechanisms of pain 
and in applying such knowledge to 
the treatment of intractable pain 
(see Ervin & Sternbach, 1960; Mark, 
Ervin, & Hackett, 1960). 

With respect to the above prob- 
lems this paper presents an evalua- 
tive review of the literature and a 
consideration of some of the issues 
raised by this disorder. 


SURVEY OF THE CASES 


Among the many persons cited as 
insensitive to pain are those who, 
upon careful analysis, appear rather 
to be hysterics, mental defectives, 
psychotics, and individuals with sub- 
sequently recognized peripheral nerve 
disease. There are a variety of dis- 
eases whose symptoms or sequelae 
include an insensitivity to pain which, 
being remarkable, is sometimes given 
undue weight. Burr (1900-01) had 
one hysteric patient and one with 
brain damage. Berkley's (1891, 
1900) cases were luetic, as were 
Carezzano's (1928). Holzer's (1896) 
patient probably had syringomyelia. 
Heyne's (1890) patient developed his 
symptoms following typhus and 
roseola with a high fever. Ziemssen’s 
(1890) patient seems to have had a 
stroke. The severe deficits illustrated 
in the patient of Paris and Lafforgue 
(1909) reflect considerable brain dam- 
age. Ortiz de Zarate’s (1955) interest- 
ing case seems to reflect a congeni- 
tal absence of free nerve endings. 
Moffie’s (1951, 1952) patient, though 
apparently pain-free early in life, had 
a variety of neurological disorders by 
the time he was carefully studied. 
Kunkle and Chapman's (1943) case 
had other neurological complications. 

Examples of the association of in- 
sensitivity to pain with retardation 
are found in Arbuse, Cantor, and 
Barenberg (1949), Roe (1950), 
Farquhar and Sutton (1951), Keizer 
(1951), and Madonick (1954a, 1954b). 
Couston (1954) shows how prevalent 
may be the insensitivity to pain 
among mental defectives, but 
Schachter (1956) disputes this. Vega 
(1949) presents one of the few in- 
stances of familial insensitivity to 
pain, but the whole family seems to 
have been severely retarded. 

A special category must be made 
for "asymbolia" for pain, as reported 
by Schilder and Stengel (1931), 
Hemphill and Stengel (1940), and 
Rubins and Friedman (1948). These 
cases involve lesions of the parietal 
lobe which, while not preventing the 
recognition of painful stimuli, appear 
to prevent the integration necessary 
for withdrawal. Since the behavioral 
response ("indifference," see below) 
to noxious stimuli is often similar to 
that of persons with congenital in- 
sensitivity to pain, they may er- 
roneously be classified with the latter. 


Criteria 


McMurray (1955) has pointed out 
that, for a case to be classified as 


congenitally insensitive to pain, the 
definition of each word in the name of 
the syndrome serves as a criterion to be 
met. Thus, the defect must be pres- 
ent from birth, rather than acquired 
as a possible secondary manifestation 
of a disease process or traumatic in- 
jury; there must be a general insensi- 
tivity to pain, i.e., an insensitivity to 
a variety of potentially noxious 
stimuli over the entire body, with no 
or slight involvement of the other 
sensory modalities; and there must be 
no general mental or physical retarda- 
tion. In short, persons with con- 
genital insensitivity to pain must, 
strictly speaking, be “normal” in 
every respect other than this defect. 
Ogden, Robert, and Carmichael 
(1959) also make a careful distinction 
between this defect and the similar- 
appearing sensory neuropathies which 
have received considerable attention 
recently (Denny-Brown, 1951; Man- 
dell & Smith, 1960; Munro, 1956; 
Parks & Staples, 1945; Walker, 
1955-56). These authors suggest that 
a careful neurological examination 
can distinguish among three “sensory 
syndromes”: (a) progressive sensory 
radicular neuropathy, a hereditary 
disease which begins with degenera- 
tion of the sensory neurons in the 
extremities; (b) nonprogressive sen- 
sory neuropathy, of unknown etiol- 
ogy, which may involve cranial and 
thoracic nerves as well as the limbs, 
and which differs from congenital 
insensitivity to pain in that deep 
tendon and axon reflexes are absent 
in the involved areas, there are other 
sensory deficits than pain, and de- 
myelination is apparent in sensory 
nerve biopsies; (c) congenital insensi- 
tivity to pain, in which sensory nerve 
biopsies appear normal and the pre- 
sumed "lesion," if any, is central to 
peripheral receptors and fibers. 
In this connection it is interesting 
to note that one patient who has been 
reported as having congenital insensi- 
tivity to pain is now known, in fact, 
to have hereditary sensory radicular 
neuropathy.³ 

Employing the conservative cri- 
teria suggested by McMurray (1955) 
and Ogden, Robert, and Carmichael 
(1959), and adding our own rules that 
the patients should be adolescents or 
older (for reasons given below) and 
must never have felt pain, there is no 
case which appears to be a genuine in- 
stance of congenital insensitivity to 
pain. The patient who has come 
closest to being pain-free is Miss C., 
the young lady reported first by 
McMurray (1950). This woman has 
been thoroughly studied, and al- 
though no experimental procedure or 
life trauma ever evoked an apparent 
sensation of pain, she did experience 
pain a month before her death 

(Baxter & Olszewski, 1960). We will 
summarize the several reports on her 
later in this paper. 


Classification 


If we classify all the reports in the 
literature into three groups, “cer- 
tain,” “probable,” and “unlikely,” 
there is no case that falls into the first 
group. However, we will reserve our 
Group I for possible future certain 
cases; i.e., those who may meet all 
the criteria mentioned above. 

Group II includes the probable 
cases of congenital insensitivity to 
pain, and included here are those 
adults, like Miss C., who rarely in 
their life felt pain, and some children. 
Dearborn's (1932) famous Human 
Pincushion, studied when 54 years 
old, experienced pain only three times 
in his life: at the age of 7 he had a 
headache for a few days after an axe 
was buried in his skull, at the age of 


3 N. Geschwind, personal communication, 
1961. 
14 he felt pain "for an instant" when 
a surgeon probed for a bullet in his 
finger, and at age 16 the setting of a 
broken fibula "hurt a little." Simi- 
larly, Jewesbury's (1951) Case 1, a 
34-year-old male, had felt pain once 
when he had a smashed finger and 
once when kicked in the testes. The 
same author's Case 3, a 76-year-old 
male, had a painless coronary throm- 
bosis and had experienced pain only 
twice in his life: in 1901 he had a 
severe headache and vomiting for 1 
day; more recently he suffered from 
urine retention due to an enlarged 
prostate. Kipnis, Cohen, Kubzansky, 
and Kunkle (1954) and (same pa- 
tient) Cohen, Kipnis, Kunkle, and 
Kubzansky (1955) report a 19-year- 
old girl whose only painful experience 
was at age 16, when she was ambula- 
tory 12 hours postappendectomy and 
developed a throbbing headache 
which disappeared upon lying down, 
a probable postpuncture reaction. 
Ervin and Sternbach's (1960) Cases 
1 and 3 are also probable and never 
experienced pain, but it was not pos- 
sible to obtain verifying biopsies. 

There are three definite cases of 
children who demonstrated an early 
insensitivity to pain, but later lost 
this insensitivity (Fanconi & Fer- 
razzini, 1959, Case 3; Jewesbury, 
1951, Case 4; Rose, 1953). Another 
child was probably also becoming 
sensitive to pain (Ford & Wilkins, 
1938, Case 1). Because of this un- 
predictable improvement (which may 
indicate retarded development of 
some central sensory center), and be- 
cause children are more difficult to 
examine satisfactorily and have had a 
much briefer time in which to experi- 
ence life's noxious stimuli, we have 
chosen to list them in Group II as 
probable rather than certain, even 
though their histories are rather im- 
pressive with respect to the syndrome. 
Here we would include the cases of 
Boyd and Nie (1949); Nissler and 
Parnitzke (1951); Cerny-Waldvogel 
(1952); Westlake (1952); Girard, 
Devic, and Garin (1953); Julião and 
Brotto (1955); Lamy, Garcin, Jam- 
met, Aussannaire, Lambert, Thiriez, 
and Grasset (1956); Jéquier and 
Deller (1956); Schachter (1956); 
Durand and Belotti (1957); Ervin 
and Sternbach’s (1960) Case 2. 

Thus, of the many citations of in- 
sensitivity to pain, none is certain, 
and only 17 are probable; and we 
shall see that these 17 are quite dis- 
similar in some important respects. 

In Group III we would include as 
unlikely the many cases cited above 
which are contaminated with other 
symptoms or are the result of disease 
or injury; and those other patients 
who, like Jewesbury's (1951) Case 2 
(who could let molten lead at 800 
degrees F. splash on him without 
pain), experienced visceral pain with 
pyelitis, appendicitis, headaches, etc. 


Insensitivity or Indifference 


Although many of the earlier 
papers use the term “indifference” to 
pain, we have preferred to use “in- 
sensitivity” to pain, and it seems 
worthwhile to note briefly the impli- 
cations in each. A cautious, be- 
havioristically-oriented investigator 
recognizes the difficulties attendant 
upon any attempt to assess another's 
sensations. It is easier and more ac- 
curate to describe the other's be- 
havior. Accordingly, those who fail 
to withdraw from or avert noxious 
stimuli may reasonably be described 
as indifferent to pain, with no judg- 
ment implied by the investigator as 
to whether or not the subject “senses” 
pain. 

While we agree in principle with 
this approach, it seems to us that the 
term “indifferent” confuses the pres- 
ent syndrome with others in which 
the patient shows no obvious be- 
havioral change to the stimuli. Thus 
we feel it more accurate to label the 
lobotomized and the asymbolia pa- 
tients as indifferent to pain, for these 
persons often admit to experiencing 
pain but make no attempt to escape 
or prevent it. The subjects of this 
paper, on the other hand, seem not to 
experience the sensation of pain. 
Potentially noxious stimuli of many 
kinds are perceived (and often 
avoided), but the patients will report 
them as “itching,” “tingling,” “tick- 
ling,” etc., as if the pattern of afferent 
impulses had been recoded. It seems 
reasonable to us, therefore, to label 
the present group as insensitive to 
pain (but not to the stimulus: Barber, 
1959; Critchley, 1956; McMurray, 
1955) and to reserve the term indiffer- 
ent for lobotomized and asymbolia 
patients. 

It might be noted, parenthetically, 
that there is a high incidence of injury 
among those whom we are calling in- 
sensitive (see below), but that injury 
occurs relatively infrequently among 
those who are lobotomized or are 
receiving morphine. This behavioral 
distinction is an additional reason for 
considering our insensitive group 
separately. 


Best Documented Case 


Of all the cases available, McMur- 
ray's (1950) patient comes closest to 
being in Group I as certain, and this 
patient is also the one most thor- 
oughly studied. We will summarize 
briefly the several papers reporting 
various aspects of her case: 

McMurray (1950) reported on a 22-year-old 
white woman, Canadian born of British 
ancestry and a university student, who had 
all sense modalities except pain intact. 
There was no ciliospinal reflex, and the 
gag reflex could be elicited only on stimulating 
low down on the pharyngeal wall. 
Corneal reflexes were absent in both eyes. 
She could distinguish warm from cool, even 
when the difference was not great. Histamine 
(.1 cc of 1:1000 solution histamine phosphate 
IV) produced a slight taste in the mouth, a 
throbbing sensation, specks before the eyes, 
an increased pulse rate and throbbing of 
arteries in the neck, but no headache. There 
was no pain on muscle ischemia test with 
hand dynamometer, nor with electric shock 
from an inductorium, nor from inserting a 
stick up through the nostrils, etc. No sig- 
nificant alterations in blood pressure, heart 
rate, or respiration occurred to cold water 
(0°-2° C.), hot water (49°-51° C.), or electric 
shock, but the subject reacted on these vari- 
ables as did controls to a difficult size dis- 
crimination test, mirror drawing task, and 
exercise. Her IQ was high (Wechsler- 
Bellevue=128), and her personality as 
judged from Rorschach, Thematic Appercep- 
tion Test, Cornell Index, McFarland-Seitz 
Psychosomatic Inventory, and from inter- 
views, appeared normal. 
Later Petrie (1953) reported on the serious 
orthopedic procedures that were necessary on 
this patient, including a "shelf" operation of 
the right hip, and a decompression and fusion 
of the spine. He concludes, 
In this case changes identical with those of 
"Charcot's joints" in the knee, hip and 
spine occurred in a person with no other 
defect in her nervous functions (except loss 
of pain-sensation) until the spinal lesion 
itself produced a paraplegia. It seems 
probable that these joint changes were due 
solely to the lack of protection usually 
given by the sensation of pain (p. 400). 
Feindel (1953) supplied a histological study 
of the nerve endings of this patient, from 
biopsies from the skin and periosteum of the 
right hip taken at the time of operation; he 
found normal nerve endings from both regions. 
A month before her death she complained 
of discomfort, tenderness and pain in the left 
hip, of which X rays showed a partial sub- 
luxation and destruction of the left femoral 
head. Pain was relieved by analgesic tablets. 
The patient died in 1955, at the age of 29, 
from bronchopneumonia and amyloidosis. 
On autopsy, Baxter and Olszewski (1960) re- 
port, 
We have failed, therefore, to demonstrate 
any anatomical abnormalities of those 
nervous system structures thought to be 
concerned with the transmission, elabora- 
tion and perception of pain impulses (p. 
392). 
They note, however, that the defect might 
have been either submicroscopic (slight varia- 
tions in fiber size or malposition of the nodes 
of Ranvier), or in terms of organization rather 
than structure (slight changes in synaptic 
relationships). Baxter also made these ob- 
servations in discussing our cases (Ervin & 
Sternbach, 1960). Finally, Werner (see 
Jéquier & Deller, 1956) mentions having seen 
the patient. 


Familial Instances 


Familial occurrences of the syn- 
drome are very rare. Von Hagen (see 
Boyd & Nie, 1949) reports that 
Nielsen (Los Angeles) saw two girls, 
sisters, with congenital insensitivity 
to pain, and their hyperalgesic 
brother; their family history appeared 
negative with respect to the syn- 
drome. Critchley (1934) described a 
single case and wrote that Schiller 
referred to a similar patient in whom 
there was a familial incidence of this 
syndrome. Fanconi and Ferrazzini 
(1957) report a sister and brother 
(Cases 1 and 2), with a normal 
brother between them in age, and 
hypothesize a recessive inheritance 
(the parents were blood-related). 
Weddell (see Ervin & Sternbach, 
1960), discussing our family of cases, 
mentioned seeing two brothers with 
this syndrome. These citations, with 
Vega's (1949) retarded family men- 
tioned above, are the only instances 
we could find of familial insensitivity 
to pain.⁴ 

Figure 1 shows the family tree of 
our cases of insensitivity to pain and 
supports our contention that the syn- 
drome is a hereditary one in this 
family. As is seen, some are com- 
pletely insensitive, some have elevated 


4 This footnote, and references cited therein, 
added in proof: Magee, Schneider, and Rosen- 
zweig (1961) report three cases, one of whom 
claimed five of his six sons were indifferent to 
pain, but he would not permit their examina- 
tion. Schneider (1962) considers the syndrome 
a defense mechanism. We argue against this 
view in an exchange with Schneider (Psy- 
chosom. Med., in press.) 
thresholds to pain, and others are 
quite normal. An extensive series of 
blood groupings failed to show any 
patterning with respect to the syn- 
drome; nevertheless we would propose 
a dominant genetic factor with vary- 
ing degrees of penetrance. 


Heterogeneity of the Syndrome 


Although the papers cited in 
Group II, above, appear to present 
authentic probable cases of con- 
genital insensitivity to pain, it must 
be emphasized that the group of pa- 
tients is by no means homogeneous. 
For example, some show autonomic 
responses to noxious stimuli (Cohen 
et al., 1955; Durand & Belotti, 1957), 
while others do not (Cerny-Wald- 
vogel, 1952; McMurray, 1950). Some 
fail to show a ciliospinal reflex (Cohen 
et al., 1955; Jewesbury, 1951, Case 1; 
McMurray, 1950) which is present in 
others. Where one young coed shows 
some impairment of temperature dis- 
crimination (Cohen et al., 1955), her 
Canadian counterpart is normal in 
this respect (McMurray, 1950). Some 
children may react to hot and cold as 
to tepid (Durand & Belotti, 1957; 
Girard et al., 1953), while others may 
find extreme cold “unpleasant” 
(Jéquier & Deller, 1956; Julião & 
Brotto, 1955). Trophic disturbances 
may be excessive (Petrie, 1953) or 
absent (Ervin & Sternbach, 1960), or 
range between these. With respect to 
behavior, both college students cited 
above were normal, but another pa- 
tient was said to have been a be- 
havior problem (Julião & Brotto, 
1955). Self-mutilation and disfigure- 
ment are frequent (Boyd & Nie, 
1949; Durand & Belotti, 1957) but 
not universal (Jewesbury, 1951, Cases 
1 and 3; Ervin & Sternbach, 1960, 
Cases 1, 2, and 3). Critchley (1956) 
has also commented on such differ- 
ences among these cases. 
From this lack of homogeneity it is 
apparent that there is no single syn- 
drome of congenital insensitivity to 
pain, but several kinds of “insensi- 
tivities,” varying in degree of sever- 
ity, and associated with different 
neurological and behavioral signs. 
It is therefore likely that the neural 
anomaly or deficit responsible, 
whether in site or in function, differs 
considerably among these patients. 


NEURAL DEFECTS 


In all the patients cited in Group 
II from whom biopsies have been 
taken, the distribution and appear- 
ance of nerve endings has been nor- 
mal. Indeed, this is an important 
criterion for differentiating the con- 
genitally pain-free from the sensory 
neuropathies. In terms of clinical 
function, these persons are reported 
to have normal or only slightly im- 
paired sensitivity to touch, vibration, 
temperature, two-point discrimina- 
tion, pinprick, etc. There is some 
reason to suspect, however, that gen- 
eral cutaneous and visceral sensitivity 
is always impaired to some degree in 
these persons, and that this impair- 
ment becomes apparent when careful 
testing for absolute and difference 
thresholds prevents the use of alterna- 
tive cues. Weddell and Mark (see 
Ervin & Sternbach, 1960), who have 
seen several such patients, made this 
observation in discussing our cases 
and so has Geschwind.³ 

Although Nissler and Parnitzke 
(1951) think the selective loss of the 
pain sense argues for a physiological- 
anatomical independence of the pain 
system, this view is probably not 
valid since the same peripheral fibers 
respond to light touch, pressure, tem- 
perature, and pain (Wall, 1960). It 
should be noted, however, that the 
issue of receptor or fiber specificity is 
by no means settled. Although 
workers like Lele, Weddell, and Wil- 
liams (1954) and Wall (1960) find 
evidence for multiple adequate stim- 
uli for a given sensory nerve ending, 
others like Loewenstein (1961) pro- 
vide evidence supporting the tradi- 
tional view. Furthermore, although 
it may be assumed that peripheral 
defects might account for pain insensi- 
tivity, this hypothesis would require 
total bodily distribution of uniform 
defects, a rather stringent require- 
ment. The anatomical and clinical 
evidence rather supports the notion 
that the neural deficit in these pa- 
tients is not a peripheral one, but 
central, a hypothesis advanced by 
most of the investigators who have 
discussed these cases. Furthermore, 
it is likely that the deficit must be 
present at the level of some central 
structure where "coding" of afferent 
impulses occurs. 

Several sites may be implicated, 
and there is no reason to believe that 
a deficit hypothesized at one level is 
any more “parsimonious” than one 
hypothesized at another. For ex- 
ample, a good case can be made for 
the dorsal horn of the cord where 
Wall (1960) has found that the same 
primary cells respond to light touch, 
deep pressure, and temperature 
changes on the skin. A slight varia- 
tion in synaptic relationships at this 
level could so alter the patterns of 
impulses arriving at the thalamus or 
the cortex that noxious stimuli, al- 
though perceived and describable by 
the patient, are not interpreted as 
painful. Similarly, a slight decrease 
in the size of fibers relaying cephalad 
from these cells could so increase 
their thresholds that the patterns of 
impulses to noxious stimuli are not 
different from those to other sensory 
events. In those persons who have 
been cited as showing no significant 
changes in autonomic activity to 


noxious stimuli, some defect in neural 
organization is likely at this level. 

At higher levels there is a conver- 
gence of fibers, in the paramedial 
medulla and the ventral tegmentum 
of the midbrain, which respond to 
stimulation of peripheral unmyeli- 
nated nerves (Collins & Randt, 1958, 
1960) and which may be presumed to 
be the pathways subserving pain. 
Here, too, slight variations in fiber 
sizes may distort the spatiotemporal 
patterns of impulses that signal 
“pain.” 

Brain stem lesions in the spino- 
thalamic and in the central grey path- 
ways effectively reduce pain percep- 
tion (Melzack, Stotler, & Livingston, 
1958). Since the spinothalamic fibers 
contribute multisynaptic branches to 
the central grey and the reticular 
formation and the diffusely project- 
ing thalamic nuclei, we may expect 
that impaired synaptic functions at 
this level will inhibit the perception of 
pain, while the larger sensory fibers 
of the lemniscal tracts which con- 
tribute directly to the sensory nuclei 
of the thalamus and relay to the 
cortex will function normally. Thus, 
posterior medial thalamic lesions have 
produced loss of pain in terminal 
cancer patients, with little or no 
sensory deficits (Mark, Ervin, & 
Hackett, 1960). 

Any of the above structures may be 
the sites where defects in neural or- 
ganization exist in the congenitally 
pain-free. The failure to detect any 
anatomical alteration in the careful 
histological study of the one patient 
autopsied (Baxter & Olszewski, 1960) 
suggests that the deficit must be a 
very slight one structurally; and the 
great variability apparent among the 
other patients, particularly with re- 
spect to the autonomically-mediated 
responses to noxious stimuli, suggests 
strongly that the locus of this deficit 
must be quite variable from patient 
to patient. 


SURVIVAL WITHOUT PAIN 


At several points thus far we have 
alluded to the variety of traumata to 
which pain-free flesh may be heir, and 
we propose now to discuss this prob- 
lem specifically. 

Young children with this syndrome 
have mutilated themselves by chew- 
ing off the tips of the fingers and/or 
tongue, by picking off the nares, and 
by suffering severe burns when lean- 
ing against stoves or sitting in scald- 
ing baths. Ziegler (see Madonick, 
1954a) reported that he saw an 11- 
month-old infant which chewed its 
fingers and tongue, banged its head 
on the wall, and showed no with- 
drawal from pinprick or painful heat 
stimuli. Ziegler wrote to Ford for a 
prognosis (Ford & Wilkins, 1938): one 
patient sustained a painless fracture 
of the leg in a football game; another 
patient died of a ruptured appendix 
and painless peritonitis. 

Of McMurray's (1950) patient who 
was autopsied, Baxter and Olszewski 
(1960) wrote: "Her lack of pain ap- 
preciation was so great that she 
suffered extensive skin and bone 
trauma which contributed in a direct 
fashion to her death" (p. 392). Yet 
this patient was bright and educated, 
her father was a physician, and she 
was in the habit of using her other 
sense modalities to check for other- 
wise unsuspected cuts, burns, frac- 
tures, etc. Despite these precautions 
her trophic disturbances were suffi- 
ciently severe to result in death. 

Not all such patients are so suscep- 
tible: our cases of hereditary insensi- 
tivity to pain included numerous in- 
juries which healed rapidly and well 
(Ervin & Sternbach, 1960). Never- 
theless, the mother, a hypertensive, 
was near death once from eclampsia 
at childbirth because she failed to 
recognize its symptoms (she had no 
headache). Because she herself is 
pain-free, she is more than usually 
sensitive to signs of disease and in- 
jury in her seven children, and has 
been able to prevent serious damage 
from developing from minor trau- 
mata. One child, indeed, suffered 
painless appendicitis with peritonitis 
as in Ford's case (above), but was 
saved by the mother’s prompt reac- 
tion to his casual remark about a 
“stiff stomach." (An incidental bene- 
fit of the syndrome was enjoyed by 
the mother’s aunt, who died of exten- 
sive carcinomatosis but felt no pain, 
only some discomfort in the terminal 
week.) 

It is our impression that most pa- 
tients, in our probable category, if 
they survive childhood, have learned 
to rely on other cues. The tinglings, 
ticklings, and itchings serve as warn- 
ings of potential tissue damage and 
the patients shift position and attend 
to the area stimulated. This notion is 
supported, in the papers cited, by the 
fact that injuries and burns are rather 
rare in the adults studied, but com- 
mon in their childhood. Though chil- 
dren are more active, of course, never- 
theless there seems to be an increased 
awareness with age of the alternative 
cues. In the family studied, the 
younger children submitted to experi- 
mental manipulation with passive 
interest; the adolescents, however, 
voiced concern over possible injury 
and had to be reassured that we 
would not actually break their fingers, 
etc. 

Weddell (see above) suspects that 
if the sense of pain is eliminated then 
warmth and cold are also likely not to 
be felt, and that other cues are used 
to distinguish temperatures, such as 
the wetness of test tubes if they are 
used for testing, or convection cur- 
rents of air. Taking care to avoid 
such cues in the family we studied we 
found some impairment of difference 
thresholds for temperature, but with- 
out these precautions their difference 
thresholds seemed normal. We cite 
this to make the point that such per- 
sons can, and often do, utilize other 
cues in making their adjustment. 
However, if the congenitally pain- 
free cannot or do not attend to these 
other cues, then, as we have seen, 
severe tissue damage, and possibly 
death, may ensue. The oft-mentioned 
adage that a sense of pain is necessary 
for survival seems, in the main, to be 
true. 


PERSONALITY DEVELOPMENT 
WITHOUT PAIN 


Whereas the chances for healthy 
survival are greatly impaired in the 
absence of a sense of pain, no such 
generalization seems possible about 
the development of a “normal” or 
“healthy” personality. We have indi- 
cated above that Julião and Brotto 
(1955) reported that their case was a 
behavior problem. This is rare. Their 
patient was a 3½-year-old boy whose 
mental age when tested was 3 years 3 
months. He liked to play with fire, 
received frequent blisters and burns 
but laughed at them, liked to hit 
with his head, pulled out his teeth, 
etc. His parents discovered that the 
only way to punish him was to douse 
him with very cold water, even a few 
drops often sufficing. 

On the other hand, Schachter 
(1956) described a boy of 3 years 10 
months, the youngest of three chil- 
dren, who was brought in by the 
parents as a "behavior problem" at 
the urging of neighbors. He had the 
typical history of painless injuries 
and burns. On examination all re- 
flexes were intact (except for no re- 
sponse to pressure or pinprick), his 
physical development appeared nor- 
mal with no trophic disturbances, and 
his use of language and responses to 
psychological examination appeared 
normal. But Schachter's judgment 
was that the child was spoiled and 
overprotected: like many a last-born, 
he was treated as the "baby" of the 
family. It appears that the behavior 
problem, as is sometimes the case 
with neurologically normal children 
as well, may have resided in the 
parents. 

There are of course important 
cultural differences and value judg- 
ments that must enter any diagnosis 
of personality. The exuberance and 
activity of young persons is pleasant 
to some adults and unpleasant to 
others. Where a child's activity is not 
self-limiting due to the absence of 
negative feedback in the form of 
pain, the threshold for annoyance at 
the child's behavior may more fre- 
quently be crossed, and the child is 
more likely to be judged a problem. 
Perhaps the pain-free child tends to 
be more active than others because 
he literally has not found his physical 
limits; this can aggravate the tend- 
ency to develop trophic disturbances 
and implies a need for the parents of 
such children to set firm restrictions 
on the child's behavior, without im- 
plications for the goodness or badness 
of the child or his actions. For ex- 
ample, Westlake's (1952) “shy carrot- 
haired” girl of 6, a hyperactive and 
restless child who spent her time run- 
ning, climbing, and jumping, de- 
veloped bony lesions of the feet which 
healed only with a period of strict rest 
in bed. Apart from hyperactivity, 
the other children cited either are not 
described with respect to personality 
development, or are reported as hav- 
ing normal personality features. 

Miss C., as noted above, had a 
severe pain deficit, but after thorough 
personality testing she was judged 
emotionally healthy. Cohen et al. 
(1955) could detect no pathology in 
their 19-year-old coed from the 
Rorschach, Thematic Apperception 
Test, Minnesota Multiphasic Inven- 
tory, Lorr Anxiety Ratings, Taylor 
Manifest Anxiety scale, etc. This 
young lady demonstrated great skill 
with language (her Wechsler-Bellevue
IQ was more than 145) and denied 
the necessity of emotional experiences 
or of emotional sensitivity for the 
manipulation of sensitive language. 
Her examiners thought she displayed 
a somewhat “flat” emotional response
in interviews, but noted that when 
she was engaged in conversation she 
became more animated. These two 
young ladies received the most exten- 
sive battery of diagnostic procedures 
of all the cases cited, and both were 
judged normal. 

The author has known for several 
years the family on whom we have 
reported, and these persons too ap- 
pear quite healthy. The children do 
very well in school, the older ones do 
part-time work, they engage in group 
activities (Boy Scouts, Girl Scouts), 
enjoy family activities, and in gen- 
eral appear to be happy and well- 
adjusted. 

It would seem, then, that despite 
rare exceptions, the absence of nor- 
mal pain sensitivity has little effect 
on normal personality development. 
Children appear easily able to in- 
corporate the cultural norms trans- 
mitted by their parents without the 
need for physical punishment. 
Taboos and lesser restrictions seem to 
be learned readily (by those of aver- 
age and above-average intelligence) 
from subtler forms of parental disapproval
than spankings, and normal
affectivity seems not to be retarded.
As McMurray (1955) has observed,
these findings imply the necessity for
considerable revision in those derived- 
drive theories of anxiety which rest 
on the early experience of pain as the 
unconditional stimulus. 


SUMMARY


All cases of congenital insensitivity 
to pain that could be found in the 
literature have been reviewed and 
placed in three categories: Group I, 
"certain," for those who meet the 
criteria of congenital, generalized 
insensitivity to pain with little or no 
other defect, who have reached ado- 
lescence or adulthood and have never 
experienced pain; Group II, "probable,"
for cases of children and for
those older patients who (very rarely) 
did experience pain; and Group III, 
"unlikely," for those cases in which 
loss of pain sense is acquired as a 
symptom of, or sequel to, some dis- 
ease or injury, or is associated with 
other physical or mental defects, or is 
incomplete as evidenced by the oc- 
currence of visceral pains. 

No case could meet the criteria for 
Group I. The 17 cases which fell into 
Group II were found to be markedly 
heterogeneous with respect to neuro- 
logical and behavioral signs, and it is 
argued that there is no single syn- 
drome of congenital insensitivity to 
pain since the nature or locus of the 
neural deficit must vary considerably 
among these persons. The possible 
kinds of neural defects are discussed. 

As is to be expected, the ability of
pain-insensitive persons to survive is
seriously impaired and depends in large
measure on their ability to utilize
other sensory cues of
actual or potential tissue damage. 
Personality, however, is rarely af- 
fected by this sensory deficit, and it 
is judged that pain is not a necessary 
component in its normal develop- 
ment. 




REFERENCES 


Arbuse, D. I., Cantor, M. B., & BARENBERG,
P. A. Congenital indifference to pain. J.
Pediat., 1949, 35, 221-226.

BARBER, T. X. Toward a theory of pain: Relief 
of chronic pain by prefrontal leucotomy, 
opiates, placebos, and hypnosis. Psychol.
Bull., 1959, 56, 430-460. 

Baxter, D. W., & OLSZEWSKI, J. Congenital 
universal insensitivity to pain. Brain, 1960, 
83, 381-393. 

BERKLEY, H. J. Two cases of general cu- 
taneous and sensory anaesthesia, without 
marked psychical implication. Brain, 1891, 
14, 441-464. 

BERKLEY, H. J. The pathological findings in 
a case of general cutaneous and sensory 
anaesthesia without psychical implications. 
Brain, 1900, 23, 111-138. 

Boyd, D. A., Jr., & Nie, L. W. Congenital
universal indifference to pain. Arch. Neurol. 
Psychiat., Chicago, 1949, 61, 402-412. 

Burr, C. W. Two cases of general anesthesia. 
U. med. Mag., U. Pa., 1900-01, 13, 245-251. 

Carezzano, P. Il sintomo del Lombroso
come carattere familiare. Arch. Antropol. 
Criminol., Torino, 1928, 48, 827-838. 

CERNY-WALDVOGEL, M. Zur Frage der kon- 
genitalen, generalisierten Schmerzindiffe- 
renz an Hand eines klinischen Falles. [Con- 
genital universal indifference to pain (c. u. 
i. t. p.) based on a clinical case] Ann. 
Paediat., 1952, 178, 65-82. 

Cohen, L. D., Kipnis, D., KUNKLE, E. C., &
KUBZANSKY, P. E. Observations of a person
with congenital insensitivity to pain. J. 
abnorm. soc. Psychol., 1955, 51, 333-338. 

Collins, W. F., & RANDT, C. T. Evoked central
nervous system activity relating to
peripheral unmyelinated or "C" fibers in
cat. J. Neurophysiol., 1958, 21, 345-352. 

Collins, W. F., & RANDT, C. T. Midbrain
evoked responses relating to peripheral un- 
myelinated or “C” fibers in cat. J. Neuro- 
physiol., 1960, 23, 47-53. 

Couston, T. A. Indifference to pain in
low-grade mental defectives. Brit. med. J.,
1954, 1, 1128-1129.

Critchley, M. Some aspects of pain. Brit.
med. J., 1934, 2, 891-896. 

Critchley, M. Congenital indifference to
pain. Ann. intern. Med., 1956, 45, 737-747. 

Dearborn, G. V. N. A case of congenital
general pure analgesia. J. nerv. ment. Dis., 
1932, 75, 612-615. 

Denny-Brown, D. Hereditary sensory ra- 
dicular neuropathy. J. Neurol. Neurosurg. 
Psychiat., 1951, 14, 237-252. 




Durand, P., & Belotti, B. M. Un caso di
indifferenza congenita al dolore—“algo- 
atarossia": Primo contributo della lettera- 
tura Italian. Helv. paediat. Acta, 1957, 12, 
116-126. 

Ervin, F. R., & STERNBACH, R. A. Hereditary 
insensitivity to pain. Trans. Amer. Neurol. 
Ass., 1960, 85, 70-74. 

FANCONI, G., & FERRAZZINI, F. Kongenitale 
Analgie (Kongenitale generalisierte 
Schmerzindifferenz). Helv. paediat. Acta, 
1957, 12, 79-115. 

Farquhar, H. G., & SUTTON, T. Congenital
indifference to pain. Lancet, 1951, 1, 827- 
828. 

Feindel, W. Note on the nerve endings in a
subject with arthropathy and congenital
absence of pain. J. bone jt. Surg., 1953,
35(B, 3), 402-407. 

Ford, F. R., & Wilkins, L. Congenital universal
insensitiveness to pain. Bull. Johns
Hopkins Hosp., 1938, 62, 448-466. 

Girard, P.-F., Devic, M., & Garin, A. A
propos d'une observation nouvelle d'indif- 
férence congénitale universelle à la douleur. 
Rev. neurol., 1953, 88, 198-201. 

HEMPHILL, R. E., & STENGEL, E. A study on 
pure word-deafness. J. Neurol. Psychiat., 
1940, 3, 251-262. 

Heyne, M. Ueber einen Fall von allgemeiner
cutaner und sensorischer Anästhesie.
Dtsch. Arch. klin. Med., 1890, 47, 75-88.

Houzer. Ein merkwürdiger Fall von Anaes- 
thesia und Analgesia totalis. Wien. med. 
Bl., 1896, 19, 163. 

Jéquier, M., & Deller, M. L'indifférence
congénitale à la douleur. Confin. neurol., 
Basel, 1956, 16, 207-215. 

Jewesbury, E. C. O. Insensitivity to pain.
Brain, 1951, 74, 336-353. 

Julião, O. F., & Brotto, W. Indiferença
congénita, generalizada, à dor. Arqu.
Neuropsiquiat., Sao Paulo, 1955, 13, 338-342.

Keizer, D. P. R. Congenital indifference to 
pain. Lancet, 1951, 1, 1020. 

Kipnis, D. M., COHEN, L. D., KUBZANSKY,
P. E., & KUNKLE, E. C. Experimental
studies on insensitivity to pain: Physiologic 
and psychologic observations. Trans. Amer. 
Neurol. Ass., 1954, 79, 105-110. 

Kunkle, E. C., & CHAPMAN, W. P. Insensitivity
to pain in man. Res. Publ. Ass.
Nerv. Ment. Dis., 1943, 23, 100-109. 

Lamy, M., Garcin, R., Jammet, M.-L.,
AUSSANNAIRE, M., LAMBERT, A., TURIEZ, H.,
& GRASSET, A. Analgésie généralisée
congénitale. Sem. Hop. Paris,
1956, 32, 2823.




Lele, P. P., Weddell, G., & WILLIAMS,
C. M. The relationship between heat trans- 
fer, skin temperature and cutaneous sensi- 
bility. J. Physiol., 1954, 126, 206-234. 

Loewenstein, W. R. On the “specificity” of
a sensory receptor. J. Neurophysiol., 1961, 
24, 150-158. 

McMurray, G. A. Experimental study of a 
case of insensitivity to pain. Arch. Neurol. 
Psychiat., Chicago, 1950, 64, 650-667. 

McMurray, G. A. Congenital insensitivity 
to pain and its implications for motiva- 
tional theory. Canad. J. Psychol., 1955, 9, 
121-131. 

Madonick, M. J. Congenital insensitiveness
to pain. J. nerv. ment. Dis., 1954, 120, 87-88. (a)

Madonick, M. J. Insensitiveness to pain.
Neurology, 1954, 4, 554-557. (b) 

Magee, K. R., SCHNEIDER, S. F., & ROSENZWEIG,
N. Congenital indifference to pain.
J. nerv. ment. Dis., 1961, 132, 249-259.

MANDELL, A. J., & Smith, C. K. Hereditary
sensory radicular neuropathy. Neurology,
1960, 10, 627-630.


Mark, V. H., Ervin, F. R., & Hackett, 
T. P. Clinical aspects of stereotactic thal- 
amotomy in the human: Part I. The treat- 
ment of chronic severe pain. Arch. Neurol., 
Chicago, 1960, 3, 351-367. 

Melzack, R., Stotler, W. A., & LIVINGSTON,
W. K. Effects of discrete brainstem lesions 
in cats on perception of noxious stimulation. 
J. Neurophysiol., 1958, 21, 353-367. 

Morrie, D. —— universal indifference 
to pain and temperature. Confin. neurol.,
Basel, 1951, 11, 219-226.

Morrie, D. Een geval van congenitale analgesie.
Ned. Tijdschr. geneesk., 1952, 96,
1177-1178.

Munro, M. Sensory radicular neuropathy in 
a deaf child. Brit. med. J., 1956, 1, 541-544. 

Nissler, K., & Parnitzke, K. H. Fehlen der
Schmerzempfindung bei einem Kinde. 
Dtsch. med. Wschr., 1951, 76, 861-863. 

OGDEN, T. E., ROBERT, F., & CARMICHAEL,
E. A. Some sensory syndromes in children:
Indifference to pain and sensory neuropathy.
J. Neurol. Neurosurg. Psychiat.,
1959, 22, 267-276.

ORTIZ DE ZARATE, J.C. Analgésie généralisée 
congénitale. Encéphale, 1955, 44, 414-426.

Panis, & Larrorcue, Un cas d'anesthésie 
généralisée. Gaz. hôp., 1909, 82, 1583-1587.

Parks, H., & Staples, O. S. Two cases of
Morvan's syndrome of uncertain cause. 
Arch. intern. Med., 1945, 75, 75-81. 

Petrie, J. G. A case of progressive joint 
disorders caused by insensitivity to pain. 
J. bone jt. Surg., 1953, 35(B, 3), 399-401. 

Roe, W. Congenital indifference to pain.
Proc. Roy. Soc. Med., 1950, 43, 250. 

Rose, G. K. Arthropathy of the ankle in con- 
genital indifference to pain. J. bone jt. 
Surg., 1953, 35(B, 3), 408-410. 

Rubins, J. L., & Friedman, E. D. Asymbolia
for pain. Arch. Neurol. Psychiat., Chicago, 
1948, 60, 554-573. 

Schachter, M. L'indifférence congénitale,
universelle, à la douleur: A propos d'une 
observation infantile. Praxis, 1956, 45, 
681-684. 

ScHILDER, P., & STENGEL, E. Asymbolia for 
pain. Arch. Neurol. Psychiat., Chicago, 1931, 
25, 598-600. 

SCHNEIDER, S. F. A psychological basis for 
indifference to pain.  Psychosom. Med., 
1962, 24, 119-132. 

VEGA, C. A. Un caso de indiferencia con- 
génita dolor, de carácter familiar. Bol. méd. 
Hosp. infant., Mexico, 1949, 6, 638-648. 

WALKER, C. H. M. Sensory radicular neurop- 
athy: Report on two cases. Great Ormond 
St. J., 1955-56, No. 10, 72-80. 

Wall, P. D. Cord cells responding to touch,
damage, and temperature of skin. J. 
Neurophysiol., 1960, 23, 197-210. 

WESTLAKE, E. K. Congenital indifference to 
pain. Brit. med. J., 1952, 1, 144. 

Ziemssen, H. Allgemeine cutane und sensorische
Anästhesie. Dtsch. Arch. klin. Med.,
1890, 47, 89-102. 


(Received September 1, 1961) 




SHAPE CONSTANCY: 
FUNCTIONAL RELATIONSHIPS AND THEORETICAL 
FORMULATIONS¹


WILLIAM EPSTEIN AND JOHN N. PARK
University of Kansas 


concerning shape constancy are


occurrence of compromise, conditions


of observation, degree of orientation, observation attitude, familiarity 


When a form is projected by light 
on the retina, the differing orienta- 
tions of the form with regard to the 
retina result in a set of different pro- 
jective shapes. Under most condi- 
tions phenomenal shape is less af- 
fected by the orientation of the stimu- 
lus object with respect to the observer 
(O) than would be expected on the 
basis of the projective transforma- 
tions which accompany variations in 
orientation. The term “shape con- 
stancy” has been introduced to desig- 
nate this fact. Shape constancy is 
defined usually as the relative con- 
stancy of the perceived shape of an 
object despite variations in its orien- 
tation. This definition reflects the 
prevalent interest in the stability of 
the perceptual world. However, it is 
also possible to locate shape con- 
stancy within a wider range of events 
all characterized by a relative inde- 
pendence of perceived shape from 
retinal, projective shape. With this
in mind, the phenomena relevant to
this paper can be placed into two
main classes:

Class 1. Under certain conditions
projective shapes which are discernibly
different yield similar perceived
shapes.

Class 2. Under certain conditions
projective shapes which are identical
yield different perceived shapes.

1 This study was supported by grants to the
first author from the National Institute of
Mental Health of the United States Public
Health Service (M-4153) and the
Research Fund of the University of Kansas.

Since Class 2 is mentioned infre- 
quently in the literature, an example 
is in order. An ellipse with a
minor-major axis ratio of 15:20 cm.
presented at 45° from the line of regard
will produce the same projective 
shape as a frontal-parallel ellipse 
with a 10.7:20 cm. axis ratio or a 
circle at 15° 13’ from the line of regard. 
With normal, unimpeded observa- 
tion, the three stimuli are easily dis- 
criminated as being different shapes 
despite the identity of their projec- 
tive shapes. 
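
The projective equivalence in this example can be checked with a few lines of arithmetic. The following is only a sketch under assumptions that are not stated in the text: parallel (orthographic) projection, rotation about the major axis so that only the minor axis is foreshortened, and foreshortening proportional to the sine of the angle measured from the line of regard. The function names are illustrative.

```python
import math

def projected_minor_axis(minor_cm, angle_from_line_of_regard_deg):
    """Foreshortened minor axis under a parallel-projection approximation.

    Assumes the form is rotated about its major axis, so the major axis
    keeps its length and the minor axis shrinks by sin(angle).
    """
    return minor_cm * math.sin(math.radians(angle_from_line_of_regard_deg))

def angle_for_projection(axis_cm, target_cm):
    """Angle from the line of regard at which axis_cm projects to target_cm."""
    return math.degrees(math.asin(target_cm / axis_cm))

MAJOR_CM = 20.0

# 15:20 ellipse presented at 45 degrees from the line of regard.
tilted_ellipse = projected_minor_axis(15.0, 45.0)

# Slant at which a circle of diameter 20 cm yields the same projection.
circle_angle = angle_for_projection(MAJOR_CM, tilted_ellipse)

print(f"tilted ellipse projects to {tilted_ellipse:.2f}:{MAJOR_CM:.0f}")
print(f"circle matches at {circle_angle:.2f} degrees from the line of regard")
```

Under these simplifying assumptions the tilted ellipse projects to about 10.6:20, close to the 10.7:20 frontal-parallel ellipse given above. The angle the sketch returns for the circle need not reproduce the printed value, which may rest on a different angular convention or projection model.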

The general plan of this paper is as
follows: (a) A survey of the empirical
findings concerning shape constancy
is presented first.² (b) This is followed
by a discussion of several possible
explanations of shape constancy
with special attention devoted to the
shape-slant invariance hypothesis.
(c) In the final section some methodological
considerations regarding the
experiments in this area are presented.

2 A number of publications concerned with
shape constancy have been authored by
Japanese investigators. Unfortunately, for the
present writers, all but a few of these
have been written in Japanese. However,
most of these have been reviewed by Akishige
(1958, pp. 147-149) in an article written in
English and also by Okada (1961). In view of
the indirect nature of our acquaintance with
this work we have elected to omit these
studies from our review.


SURVEY OF EMPIRICAL FINDINGS
CONCERNING SHAPE CONSTANCY


The Occurrence of Compromise


All studies have concurred in the
finding that apparent shape does not
correspond with the objective dimensions
of a slanted standard. Under
optimal conditions of observation
perceived shape will be intermediate
between the objective and projective
dimensions of the standard. The
term “compromise” was introduced
by Thouless (1931a) to describe this
result. However, it should be recalled
that the term describes only one instance
of a more general class of shape
perceptions which have the following
common characteristic: the dimensions
of the perceived shape cannot be
precisely predicted from knowledge
of either the projective shape or the
objective shape. This latter statement
takes cognizance of the fact
that in some instances apparent shape
is not intermediate between the objective
and projective shape. On occasion
the dimensions of the perceived
shape, i.e., the comparison
match, exceed the objective dimensions
or fall short of the projective
dimensions.


Conditions of Observation


Several investigators have shown
that shape constancy is diminished
by conditions which reduce the availability
or effectiveness of perceptual
cues to the orientation of the object
(e.g., Eissler, 1933; Langdon, 1951,
1953, 1955b; Leibowitz, Bussey, &
McGuire, 1957; Nelson & Bartley,
1956; Stavrianos, 1945; Thouless,
1931a; Yensen, 1955). There is some
evidence that the effect of eliminating
binocular cues on the judgment of
shape will vary depending on the
angle of inclination at which the
standard stimulus is presented (Stavrianos,
1945, p. 55).

Various techniques have been em- 
ployed for manipulating the avail- 
ability of cues. Among the earlier 
methods are the restriction of ob- 
servation to monocular viewing or 
squinting, and the gradual narrowing 
of the field of vision to the stimulus 
objects alone. A procedure intro- 
duced more recently involves the 
elimination of discernible surface 
texture.


Degree of Orientation 


All of the experiments which have 
dealt with this variable indicate that 
the amount of constancy expressed in 
terms of Brunswik or Thouless ratios, 
or various other indices, does not 
remain constant over the arc of slant. 

Eissler’s (1933) results for six 
trained Os showed that constancy 
decreased as the angle of rotation 
from the frontal-parallel plane in- 
creased. This finding has also been 
reported by Sheehan (1938). Eissler
also expressed his results in terms of
“transformation” or amount of compensation,
which was given by the
formula (a − p)/p, where “a” is the
match chosen by O and “p” is the
projective shape. Transformation
values increased with the angle of
orientation.

Lichte (1952) obtained a linear 
function between the Brunswik ratio 
and the angle of rotation of the stimulus
object from the frontal-parallel
plane. There was a regular decrease
in the ratio with increases in the 
angle of rotation. This is in agree- 
ment with the results cited above. 
When Lichte replotted his data using 
the simpler measure, a − p, he found
an asymptotic function with increas- 
ing angles of rotation. To explain 
this function Lichte (1952) suggested 
"that as the cues to the ‘non-normal’ 
orientation become stronger, more 
and more regression takes place, up to 
a limit set by the nature of the or- 
ganism" (p. 55). When the present 
writers computed the quantity, a − p,
from the data of other investigators, 
only one set of results was found 
which corroborated Lichte's finding. 
Thouless’ (1931a) data indicated that 
the quantity, a − p, increases with
increments in the angle of slant up to 
60°, changing little with further in- 
crements in slant. Instead of an 
asymptotic approach to a limit most 
of the other experiments revealed an 
increase in the quantity, a − p, followed
by a decrease. The results of
Nellis (1958) and Leibowitz et al. 
(1957) showed that the quantity, 
a − p, increases with increments in
the angle of orientation up to 60°-70° 
and then decreases at more extreme 
degrees of slant. A plot of Moore's 
(1938) findings for a 10-inch straight 
line revealed that a − p increases with
increments in slant up to 30° and then 
decreases for a slant of 35°. A differ- 
ent kind of problem for Lichte is pre- 
sented by Stavrianos' (1945) data, 
which indicated that the quantity, 
a − p, reaches a limit at an angle of
45° under full-cue conditions, but 
continues to increase with increases in 
angle of slant up to 55° under reduced- 
cue conditions. If the limit for full- 
cue conditions is “set by the nature of 
the organism," as Lichte claims, why 
should the value of a − p increase
beyond this limit when cues to slant 
are eliminated? 

Some of the confusions which prevail
in the investigation of the relationship
between shape constancy and
angle of slant are exemplified in 
Lichte's study. Lichte confronted his 
Os with rectangular standard stimuli 
in the frontal-parallel plane. Each of 
the four standards was 5 inches high, 
and varied in width from 4.75 inches 
to 3.25 inches in .5-inch steps. The 
variable stimulus was a 5-inch square. 
The O was asked to rotate the vari- 
able stimulus “until it appeared equal 
in shape and width to the standard 
stimulus” (Lichte, 1952, p. 50). How- 
ever, in view of the unaltering physi- 
cal dimensions of the variable it was 
impossible for O to match the stand- 
ard’s physical shape. Nor was it pos- 
sible to achieve identity of projec- 
tive or phenomenal shape since any 
rotation of the variable produced a 
trapezoidal projective shape whose 
phenomenal shape may be assumed 
to have been trapezoidal also. The 
only alternative left to O was to 
match the apparent width of the 
turned variable which was phe- 
nomenally trapezoidal with the width 
of the standard which was phenom- 
enally rectangular. This task is not 
entirely appropriate for a study of 
shape constancy. 

In addition, the finding that the 
Brunswik ratio decreased with angle 
of rotation is an artifact of Lichte’s 
peculiar assignment of values. In 
computing the Brunswik ratio, Lichte 
used the projective width of the vari- 
able setting as p, the objective width 
of the variable as 7 (real or physical 
value), and the objective width of the 
standard asa. To illustrate the arti- 
factual nature of Lichte's "finding" 
suppose that the two widths appear 
equal when the projective width of 
the variable is 10% less than the pro- 
jective width of the standard, ie. a 
constant error of underestimation 
occurs. Bearing in mind the nature of 
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the task and the assignment of values, 
then a reduction in the objective 
width of the standard will have three 
consequences: angle of rotation will 
be increased, the denominator of the 
Brunswik ratio will increase, and the 
numerator of the ratio will decrease. 
This means that the Brunswik ratio 
must decrease as angle of rotation in- 
creases; exactly what Lichte reported. 
However, this is a mathematico- 
experimental artifact and not a find- 
ing about constancy. A similar ob- 
jection may be directed to Eissler's 
finding regarding the amount of com- 
pensation and angle of orientation. 
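
The artifact can be made concrete numerically. The sketch below is an illustration rather than a reanalysis of Lichte's data. It takes the four standard widths and the 5-inch variable from the description above, assumes the hypothetical constant 10% underestimation, uses the customary form of the Brunswik ratio, (a − p)/(r − p), and approximates the projective width of the rotated square by r cos θ (parallel projection, an assumption of this sketch).

```python
import math

R_VARIABLE = 5.0                             # objective width of the rotatable square (inches)
STANDARD_WIDTHS = [4.75, 4.25, 3.75, 3.25]   # objective widths of the standards (inches)
UNDERESTIMATION = 0.10                       # hypothetical constant error from the illustration

def lichte_style_ratio(standard_width, r=R_VARIABLE, error=UNDERESTIMATION):
    """Return (rotation angle in degrees, Brunswik ratio) for one standard.

    Values are assigned as in the text: a is the objective width of the
    standard, r the objective width of the variable, and p the projective
    width of the variable at the setting judged equal.
    """
    a = standard_width
    p = (1.0 - error) * a                    # match falls 10% short of the standard
    angle = math.degrees(math.acos(p / r))   # rotation needed so that r*cos(angle) == p
    ratio = (a - p) / (r - p)
    return angle, ratio

rows = [lichte_style_ratio(w) for w in STANDARD_WIDTHS]
for width, (angle, ratio) in zip(STANDARD_WIDTHS, rows):
    print(f"standard {width:.2f} in.  rotation {angle:5.1f} deg  Brunswik ratio {ratio:.3f}")
```

With a constant error and no change whatever in the observer's behavior, the printed ratio falls steadily as the rotation angle rises, which is the pattern described above as artifactual.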
The results obtained by Thouless 
(1931a) and Langdon (1953) are the 
reverse of those reported by Eissler, 
Sheehan, and Lichte. The main ob- 
jective of Langdon's (1953, Experi- 
ment II) study was to investigate the 
presumed changes in constancy over 
the arc of inclination "under condi- 
tions which make it a reasonable 
assumption that the shape undergoing 
tilt continues to be perceived as 
physically unchanged" (p. 93). These 
conditions were achieved by oscillat- 
ing the standard circular shape (solid 
or wire outline) continuously through 
an arc of 90° and obtaining judgments
of shape during the oscillation.
The comparison shapes were ellipses
(solid shapes or wire outlines) in the 
frontal-parallel plane representing 
projections of the circle at various 
points on the arc of oscillation rang- 
ing from frontal-parallel to near the 
line of regard. The O's task was to 
indicate when the oscillating circle 
and the ellipse appeared most similar 
in shape. The results “show an ex- 
tremely high constancy toward the 
line of regard falling to a low point 
around 60°-50°, rising slightly there- 
after and then declining once more as 
the frontal-parallel plane is ap- 
proached" (Langdon, 1953, p. 102). 
Comparable results were obtained in 
a later experiment (Langdon, 1955b)
using new points on the arc of inclina- 
tion, intermediate between those used 
in the first study. However, one reservation
must be expressed about
Langdon's finding. We were unable 
to determine how Langdon arrived at 
the Thouless values reported in his 
papers. The contents of Langdon's 
Table VI (1955b, p. 25) should be 
sufficient for this task. Yet there does 
not seem to be any assignment of 
these data which will yield the mean 
values of constancy which are reported
by Langdon. Our doubts on this 
matter are reinforced by the observation
that the constancy values reported
in an earlier experiment (Langdon,
1953, Experiment I, Table I, p.
95) are in error. 

Clouding the picture further are 
the results of Moore (1938), Stav- 
rianos (1945), and Leibowitz et al. 
(1957). Moore (1938) found that the 
Brunswik ratio decreased from .58 to 
.51 when the angle of slant increased
from 20° to 25°. However, there was 
a decrease of only .03 as the angle 
was increased further in 5° steps to 40°.
Brunswik ratios obtained by Stavrianos
(1945) for four angles of inclination
ranging from 15° to 55°
showed that some Os exhibited in- 
creased constancy with increased tilt 
while others showed the opposite 
trend.  Stavrianos suggested that 
"individual differences may be re- 
sponsible for the discrepancy between 
the findings of Thouless and those of 
Eissler with regard to the effect of 
increasing tilt on shape constancy” 
(p. 54). Brunswik ratios calculated 
by the writers from the data pre- 
sented by Leibowitz et al. (1957, p. 
659) showed that for binocular viewing
the amount of constancy remained
unchanged through five angles of 
inclination ranging from the frontal-parallel
plane to 66° and then decreased
as angle of inclination was
increased. For monocular viewing
the amount of constancy increased 
with increases in angle of inclination 
up to 56°, remained unchanged at 
66°, and then decreased with further
increases in angle. Finally, Nellis' 
(1958) curves show decreases in con- 
stancy as standard ellipses were 
turned from 30° to 75° from the 
frontal-parallel plane. However, the 
same Os showed “superconstancy” or
“overcompensation” for the segment 
of the arc from 0° to about 30°. It 
should be added that Nellis did not 
treat her data in terms of constancy. 
Instead, she spoke of “compensation,”
which was defined as the log ratio,
log (a/p) (Nellis, 1958, p. 44). It was
found that compensation increased 
as the angle of slant of the standard 
increased. Comparable findings are 
reported by Nellis for standard el- 
lipses of different degrees of eccen- 
tricity, for slants on the horizontal 
and vertical axes, and for various 
slants of the background. 

The only conclusion which is war- 
ranted by this summary is that the 
precise function relating constancy to 
angle of orientation is yet to be deter- 
mined. It is not surprising that the 
results of experiments which differ 
along dimensions whose influence on 
apparent shape is unknown will fail to 
agree. In addition, the absence of a 
standard quantitative expression of 
constancy makes wide agreement un- 
likely. 
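
How much the choice of index matters can be seen by applying the several measures mentioned in this section to one and the same set of matches. The sketch below uses hypothetical matches, not data from any of the studies reviewed, and assumes the customary forms of the Brunswik ratio, (a − p)/(r − p), and the Thouless ratio, (log a − log p)/(log r − log p). The base-10 logarithm for compensation and the r cos θ projection are likewise assumptions of this sketch.

```python
import math

def brunswik_ratio(a, p, r):
    return (a - p) / (r - p)

def thouless_ratio(a, p, r):
    return (math.log(a) - math.log(p)) / (math.log(r) - math.log(p))

def transformation(a, p):
    """Eissler's measure as given in the text: (a - p)/p."""
    return (a - p) / p

def difference(a, p):
    """The 'simpler measure' a - p used in Lichte's replot."""
    return a - p

def compensation(a, p):
    """Nellis' log (a/p); base 10 is assumed here."""
    return math.log10(a / p)

R = 10.0                      # hypothetical objective width
SLANTS = [15, 30, 45, 60]     # degrees from the frontal-parallel plane

table = []
for slant in SLANTS:
    p = R * math.cos(math.radians(slant))   # projective width, parallel projection
    a = math.sqrt(R * p)                    # hypothetical match: geometric mean of r and p
    table.append({
        "slant": slant,
        "brunswik": brunswik_ratio(a, p, R),
        "thouless": thouless_ratio(a, p, R),
        "transformation": transformation(a, p),
        "difference": difference(a, p),
        "compensation": compensation(a, p),
    })

for row in table:
    print("  ".join(f"{key}={value:.3f}" for key, value in row.items()))
```

For these invented matches the Thouless ratio stays at .50 at every slant, the Brunswik ratio declines, and the transformation, difference, and compensation measures all rise. The same judgments would therefore be described as constant, decreasing, or increasing depending on the index adopted.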


Observation Attitude 


Constancy is greatest when O as- 
sumes an objective attitude and at- 
tempts to report the actual physical 
shape of the standard. Klimpfinger 
(1933b) found that the adoption of an 
analytic, retinal-matching attitude 
may be as effective in reducing con- 
stancy as is the elimination of cues of 
orientation. These findings were con- 
firmed by Gottheil and Bitterman 
(1951). A study by Angrist is also
relevant. Angrist (1954) presented a 
white disc in a Dodge tachistoscope 
for .1 second, and asked O to judge 
its shape under instructions "to take 
the angle of regard . . . into account" 
(p. 34). She found that these instruc- 
tions enhanced constancy as com- 
pared with earlier uninstructed judgments
by the same O. However, as
Angrist noted, the effects of instruc- 
tions in her experiment were inex- 
tricably confounded with the effects 
of practice, and therefore her results 
are equivocal. A recent experiment 
by Epstein, Bontrager, and Park 
(1962) found an interaction of attitude
with conditions of observation.
While different attitudes affected 
shape constancy under conditions of 
unrestricted binocular vision, these 
same observation attitudes were in- 
effectual when the stimuli were 
viewed monocularly under reduced 
conditions. It might also be appropri- 
ate to point out that Klimpfinger 
(1933b) never compared the results 
obtained for different attitudes under 
identical conditions, and Gottheil 
and Bitterman (1951) instructed the 
same Os to assume different attitudes 
on successive occasions. In the study 
by Epstein et al., different Os were 
assigned to the different attitudinal 
conditions while all other conditions 
remained constant. 

Thouless (1932), Sheehan (1938), 
Nellis (1958), and Leibowitz, 
Waskow, Loeffler, and Glaser (1959) 
have provided indirect evidence of 
the effects of attitude under condi- 
tions of unrestricted binocular vision. 
Nellis (1958) found that 8-year-old 
and 10-year-old children showed less 
constancy than adults. She proposed 
that these differences “reflect atti- 
tudes which are predominant at the 
different ages" (p. 85). However, 
earlier results reported by Klimp- 
finger (1933a) are not consistent with 
Nellis’ findings or interpretation.
Klimpfinger found a regular increase
in shape constancy from ages 3 to 14,
followed by a rapid decline: constancy
fell to the 9-year-old level for adults
18-30 years of age and to the
8-year-old level for adults 30-37 years
of age.

Thouless (1932) and Leibowitz 
et al. (1959) found an inverse correla- 
tion between intelligence and con- 
stancy. The latter authors suggested 
that this relationship is the result of 
different attitudes adopted by Os at 
the different levels of intelligence. 
The more intelligent Os are assumed 
to adopt an analytic attitude (result- 
ing in low constancy scores), while 
the less intelligent Os are presumed 
to adopt an objective attitude. 


Familiarity and Representativeness 

Geometrical forms often have im- 
portant properties which cannot be 
specified physicalistically. Among 
these properties are familiarity and 
representativeness. Familiarity is 
some function of the number of previ- 
ous exposures of the form. The rep- 
resentational character of the form is 
determined by the specific meaningful 
identity which is assigned to it. A 
nonsense form with an irregular, 
randomly curved contour is both un- 
familiar and nonrepresentational. A 
regular geometrical form, e.g., a
rectangle, is familiar from previous
experience but is not necessarily 
representational. A rectangular play- 
ing card is both familiar and represen- 
tational, i.e., it has a specific, mean- 
ingful identity. 

The influence of familiarity on 
shape constancy has been studied by 
Borresen and Lichte (1962), Langdon 
(1953), Moore (1938), Nelson and 
Bartley (1956), and Thouless (1931b). 
Only Borresen and Lichte obtained 
evidence that constancy is a function 
of familiarity. In this study the irregular
nonsense forms whose shape
was to be judged were first familiarized.
The familiarization procedure
consisted of presenting the forms 
with varying frequencies at various 
angles of orientation. The Os were 
instructed to duplicate the shape of 
the standard when the standard “was 
considered as an object independent 
of its slant” (Borresen & Lichte, 1962, 
p. 94). A control group was not given 
familiarization training. Judgments
of the shape of the five standards were
obtained at two angles of orientation.
Shape constancy was found to be an
increasing function of the frequency 
with which the shape was presented 
in the familiarization period. How- 
ever, the number of orientations pre- 
sented during familiarization was not 
a significant determinant of con- 
stancy. This latter finding is surpris- 
ing and should be examined further. 
We would expect that viewing the 
standard in various orientations 
would provide O with an index of the 
perspective transformations which 
the standard undergoes when dis- 
placed from the frontal-parallel plane. 
Such information should be of value 
in making shape judgments of the 
slanted standards. Another aspect of 
the experiment which warrants fur- 
ther study is the effect of instructions. 
The instructional injunction quoted 
above is ambiguous and may be inter- 
preted by O as requiring that orienta- 
tion be disregarded or conversely that 
orientation be taken into account. 

Further evidence which suggests a 
relationship between familiarity and
constancy is provided by a study of 
the apparent shape of afterimages. 
Ohwaki (1957) has reported evidence 
that the shape of the afterimage is 
not an exact representation of the 
retinal stimulation. Ohwaki’s results 
suggest that the apparent shape may 
depend also upon O's inspection of 
the stimulus or E's (experimenter's)
verbal description.

Studies of the influence of repre- 
sentativeness have not been reported. 
However, one experimental approach 
to this question might follow the lead 
supplied by McKennell (1960) in his 
study of apparent size. Under condi-
tions of unrestricted viewing McKen-
nell had O make three sets of size
judgments: judgments from memory 
of the sizes of several representational 
objects, visual estimates of the sizes 
of the same representational objects, 
and visual estimates of the sizes of 
comparable white cardboard squares. 
The contribution of memory to the
visual estimates of each representa-
tional object was determined by com-
puting a partial correlation of the form
r12.3; the contribution of visual cues
was determined by computing a par-
tial correlation of the form r23.1. An
analogous experiment on apparent
shape would involve a correlational 
analysis of the visual judgments of 
representational shapes, identical 
nonrepresentational shapes and esti- 
mates of the representational shapes 
from memory. Precautions would be 
necessary to assure that the repre- 
sentative and nonrepresentative 
forms were equivalent in other re- 
spects relevant to shape or slant de- 
termination, e.g., presence of inner 
detail. 
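
The correlational analysis proposed above can be sketched numerically. The following is a minimal illustration, assuming that the three sets of judgments are numbered 1 (estimates from memory), 2 (visual estimates of the representational forms), and 3 (visual estimates of the nonrepresentational forms). This numbering, the function name, and the correlation values are hypothetical and are not taken from McKennell's data.

```python
import math


def partial_correlation(r_xy: float, r_xz: float, r_yz: float) -> float:
    """First-order partial correlation of x and y with z held constant."""
    denominator = math.sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))
    if denominator == 0.0:
        raise ValueError("undefined when a correlation with z is +1 or -1")
    return (r_xy - r_xz * r_yz) / denominator


# Hypothetical zero-order correlations, for illustration only.
r_12 = 0.60  # memory estimates vs. visual estimates of representational forms
r_13 = 0.30  # memory estimates vs. visual estimates of nonrepresentational forms
r_23 = 0.70  # visual estimates of representational vs. nonrepresentational forms

# Contribution of memory: sets 1 and 2 with set 3 held constant.
memory_contribution = partial_correlation(r_12, r_13, r_23)
# Contribution of visual cues: sets 2 and 3 with set 1 held constant.
visual_contribution = partial_correlation(r_23, r_12, r_13)

print(f"memory: {memory_contribution:.3f}, visual: {visual_contribution:.3f}")
```

A large partial correlation for memory would suggest that the identity of the form contributes to the judgment over and above what the visual cues alone supply.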


Differences between Forms 


There are some data which suggest 
that the variable of form interacts 
with slant in determining the amount 
of constancy. Thus Beck and Gibson 
(1955) obtained differences between 
quadrilaterals and triangles. They 
found a significantly greater number 
of exceptions from the required slant-
shape relationship for the quadri-
lateral stimuli. Also Arnoult (1954),
in a study of shape discriminations as
a function of angular orientation, re-
ported differences between two non-
sense (nonrepresentational) forms.
Finally, Moore (1938) reported that a
slanted circle will show more con-
stancy than a slanted line. However,
no systematic study of this factor has
been performed.


Individual Differences and Individual 
Consistency 

A number of investigators have re-
ported the existence of individual
differences in constancy (Langdon,
1953, 1955b; Lichte, 1952; Nellis,
1958; Sheehan, 1938; Thouless, 1932).
Beveridge (1935-36) has reported
racial differences in shape constancy.
Among the factors which have been 
mentioned as influential in producing 
individual differences are: differences 
in attitude, shifts in apparent orien- 
tation, differences in sensory effi- 
ciency, practice effects, etc. 

All investigators have agreed in the 
observation that individual consist- 
ency in the degree of shape constancy 
demonstrated under identical or simi- 
lar conditions is very high. This find- 
ing has been reported by Thouless 
(1932), Sheehan (1938), Moore 
(1938), Weber (1939), Lichte (1952), 
and others. 


The Effects of the Background 


Several studies have been devoted 
to the perception of form in an un- 
structured field. The technique em- 
ployed was to present the target in an 
otherwise totally dark room (Lang- 
don, 1953, 1955b; Nelson & Bartley, 
1956; Thouless, 1931b). Under these 
conditions, perceived shape approxi- 
mates the requirements of retinal 
shape.

The remaining studies have investi-
gated the effect of special background
conditions. In considering these ex-
periments, we have omitted the very
effective illusions which demonstrate
an influence of the "vector-field" on
perceived shape (e.g., Campbell, 1937;
Orbison, 1939). These studies are at 
best of marginal relevance since they 
deal with the appearance of drawings. 
Nellis (1958) has studied the per- 
ception of elliptical shapes which were 
slanted at different angles out of the 
frontal-parallel plane and mounted on 
a background which was slanted also. 
The main finding was that "shape- 
compensation" log (a/p) decreased 
progressively with increases in the 
slant of the background. Nellis re-
ported that the decrease was greater
when the standard and background 
were slanted in the horizontal plane as 
compared with slants from the verti- 
cal plane. In addition it was found 
that the influence of background slant 
increased as the angle of slant of the 
standard increased. 

Langdon (1955c) has also investi-
gated the role of the spatial-surround
cues. Pairs of shapes, both stationary
and rotating, were matched within a 
simulated Ames-type distorted room 
with a rotation of the frontal-parallel 
plane of 30°. The shapes were of two 
kinds. The first was calculated for 
normal (Euclidean) perceptual space, 
and the second pair was comparable 
only in apparent space. Langdon 
found that O could match an “el- 
lipse” and a “circle” presented in the 
windows of a distorted room, when 
the two stimuli had been constructed 
in such a way as to be comparable for 
"equivalent space" (the space of the 
distorted room). According to Lang- 
don, the ability of O to make such a 
match is evidence that shapes in the 
distorted room were seen in the di-
mensions of equivalent space. Lang-
don further maintained that O's
ability to match an equivalent-space 
ellipse with an equivalent-space oscil- 
lating circle is evidence that the shape- 
inducing effects of the distorted room 
were more powerful than the tend- 
ency of movement to restore the true 
shape of the stimuli. As reasonable as
these conclusions may seem, they do 
not necessarily follow. It is quite 
possible that O can match the shapes 
of the equivalent-space ellipse and 
circle in the absence of the shape- 
inducing effects of the distorted room. 
Langdon's equivalent-space circle was 
actually an oblate ellipse, the left 
side of which is only very slightly 
wider than the right. As this oblate 
ellipse rotates clockwise away from 
the back wall of the distorted room, 
it projects the image of a prolate 
ellipse which becomes progressively 
narrower. Entirely aside from in- 
duced effects, we would expect O's 
perception to be intermediate be- 
tween the projective shape and the 
real shape. If the oblate ellipse is 
slanted far enough away from the 
frontal-parallel plane, this compro- 
mise perception ought to be of an 
ellipse which appears equal to the 
comparison ellipse. (Although the 
comparison ellipse was somewhat 
egg-shaped, it should be possible for O 
to make a rough match, if for no 
other reason than that the rotating 
oblate ellipse was also somewhat egg- 
shaped, its left side being somewhat 
longer and narrower than its right— 
just as was the case for the rotating 
ellipse.) There is, however, an even 
stronger reason for believing that 
Langdon has not demonstrated the 
shape-inducing effects of the dis- 
torted room. Langdon’s conclusions 
rest upon the assumption that the 
distorted-room effects are so strong 
that shapes calculated to be com- 
parable in their presence cannot be 
matched in their absence. If the 
shape-distorting effects are, in fact, 
this strong, then it should not be pos- 
sible in their presence to match shapes 
calculated for normal perceptual 
space. Yet Langdon found that O 
can match a normal stationary circle 
with a normal ellipse in the distorted
room. 


Movement 


As noted above, when all cues 
stemming from the object and its 
surrounding field are eliminated, 
perceived shape approximates pro- 
jective shape. Under such conditions 
Langdon (1951, 1955b) has reported 
that a regular rotatory motion “is
sufficient to restore constancy” (1951, 
p. 157). In a completely dark room 
Langdon (1951) presented two ob-
jects which could be seen by fluores-
cent coating. The objects were viewed
successively with one eye. One object 
was a circular outline of wire which 
rotated mechanically on its vertical 
axis. The other object was one of 15 
elliptical outlines which represented 
various frontal-parallel projections 
of the circle. The elliptical outline 
was presented in the frontal-parallel 
plane. The O's task was to indicate 
when the rotating circle and the el- 
liptical shape appeared equal. The 
measure of constancy was the excess 
angle of orientation over and above 
that required to produce a frontal- 
parallel projection equal to the com- 
parison ellipse. Thus if the angular 
position of the circle was 49° at the 
time of apparent equality of shape 
with an ellipse equal to the frontal- 
parallel projection of a circle at 45°, 
then the degree of constancy is rep- 
resented by the fraction 4/45 or .09. 
In this particular instance the two 
shapes will have appeared equal when 
the projective width of the circle is 
somewhat narrower than the frontal- 
parallel ellipse. This means that the 
circular form appears less elliptical, 
i.e., wider, than its projective require- 
ments; a constancy-effect. Langdon 
found that constancy measured in 
this manner rose as a smooth linear 
function of increases in rotation speed 
up to an optimal velocity. Langdon’s 
results have been stated in a concise
manner by Yensen (1957): 


... the angle of inclination at which the sub- 
ject matches a rotating circle, viewed in dark 
space, to a given frontal plane ellipse, is 
greater than the angular match for stationary 
shapes under the same conditions, and... 
this angle increases with increases in the rate 
of rotation of the rotating circle (p. 130). 


Similar results have been reported by 
Langdon (1955a) for a specially con-
structed “solid.” The solid was made
to undergo progressive physical 
changes of shape while being com- 
pared with various stationary, two- 
dimensional projections under con- 
trolled conditions. Here again, the 
continuous movement and regular 
deformation of the shape resulted in 
more veridical perception. 
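
Langdon's measure of constancy, as described above, can be expressed in a few lines. The sketch below is illustrative only: the function names are hypothetical, and the projected width is computed on the simplifying assumption of parallel (orthographic) projection, in which a circle turned through an angle from the frontal-parallel plane has a relative width equal to the cosine of that angle.

```python
import math


def langdon_constancy(match_angle_deg: float, projection_angle_deg: float) -> float:
    """Excess angle of the rotating circle over the angle whose frontal-parallel
    projection equals the comparison ellipse, as a fraction of the latter."""
    return (match_angle_deg - projection_angle_deg) / projection_angle_deg


def relative_projected_width(angle_deg: float) -> float:
    """Width of the projection relative to the frontal-parallel width
    (assumes parallel projection)."""
    return math.cos(math.radians(angle_deg))


# The example from the text: apparent equality at 49 degrees with an ellipse
# that is the projection of the circle at 45 degrees.
index = langdon_constancy(49.0, 45.0)
print(round(index, 2))  # 4/45, i.e. about .09
print(relative_projected_width(49.0) < relative_projected_width(45.0))  # True
```

The second line of output restates the point made in the text: at the moment of apparent equality the circle projects a narrower width than the comparison ellipse.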

Langdon sought to explain his re- 
sults by noting that the stationary 
and rotating shapes have different
"object-characteristics." The sta-
tionary shape appears insubstantial
while “the intervention of motion 
.. . operates to ‘create’ the object as 
a real and subsisting entity" (Lang-
don, 1951, p. 164). In a later discus-
sion Langdon (1955b) made a similar
point, suggesting that the regular 
deformation produced by rotation 
endowed the moving shape with tri- 
dimensionality, and that this con- 
tributed to an enhancement of con- 
stancy. However, it should be noted 
that Langdon (1951) found that not 
all Os experienced tridimensionality 
and that differences in this respect did 
not “appear to affect their matching 
of the shapes” (p. 162). This observa- 
tion was repeated in another context 
(Langdon, 1953, p. 100). In addition, 
Langdon does not present a clear 
statement of the reasons for main- 
taining that constancy should be en- 
hanced when the standard has a solid 
appearance. A plausible explanation 
might be formulated on the basis of 
the considerations presented in Hoch-
berg's (1957) summary of the Cornell
Symposium on Perception (see pp. 
79-81) and Gibson and Gibson’s 
(1957) work on slant and shape per- 
ception as a function of continuous 
perspective transformation. 

Yensen (1957) confirmed Langdon’s 
results. However, he considered
Langdon’s interpretation of the ex- 
perimental results to be incorrect. 
Yensen argued that Langdon’s find- 
ings should not be interpreted as a 
restoration of constancy. Yensen's
main objection stemmed from his 
results using as a frontal-parallel 
stimulus the "real" shape, i.e., a 
shape identical to the rotating stimu-
lus (4 × 4 inch square), set in a frontal-
parallel position. Yensen found that
in this case a rate of increase in
angular setting of the rotating square
occurred which was highly comparable
to the situation in which the match
was made to a given frontal-parallel
projection of the real shape, i.e., to a
shape representing the width of the 
real shape at some angle of slant. 
Yensen (1957) reasoned that “con-
stancy factors could not be operative
in matches to the ‘real’ shape and so
would not appear to be responsible
for the increasing trend in matches to
the slanted shape” (p. 131) in Lang-
don’s experiments. Yensen’s logic is
unclear. The width of the frontal- 
parallel projection of the square de- 
creases as the square rotates away 
from the frontal-parallel plane. Con- 
stancy means that the square appears 
wider than its frontal-parallel projec- 
tion. The greater the constancy, the 
greater the angle of rotation at which 
the rotating square appears equal to 
the stationary square in the frontal- 
parallel plane. There is no a priori 
reason why this angle should not in-
crease as the rate of rotation increases.
Such an increase would affect mean 
angular settings by its effect on judg- 
ments made when the square is 
rotating from the line of sight toward 
the frontal-parallel plane. The
greater the constancy, the earlier 
apparent equality will occur; hence 
the greater the angle (measured from 
frontal-parallel) at which O will indi- 
cate equality. 

However, even if we accept Yen-
sen’s reasoning, the significance of his
results regarding the real shape re-
mains open to serious question. There
are two main objections: (a) The
mean frontal-parallel plane projec- 
tive width of the rotating shape 
matched to the real shape did not 
differ significantly for the four rates 
of rotation. Table IV (Yensen, 1957, 
p. 133) shows that the widths varied 
from 3.99 inches to 3.97 inches. 
(The real shape was a 4 × 4 inch
square in the frontal-parallel plane.) 
Thus at all rates of rotation there 
obtained an almost identical high 
degree of constancy. (b) There were 
many differences between Langdon's 
conditions and those established by 
Yensen. Most deserving of mention
is the fact that while Langdon's
frontal-parallel comparisons (el-
lipses) represented various projective
shapes of the rotating standard, this
was not so in Yensen's study. In 
Yensen's experiment the frontal- 
parallel shapes were rectangles; thus 
they could represent the various pro- 
jective widths only, but not the pro- 
jective shapes of the rotating square. 
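
To give a sense of how small the differences cited in objection (a) are, the projective widths can be converted into angles of rotation. This is a sketch under the assumption of parallel projection (projective width equals the side of the square times the cosine of the angle from frontal-parallel). The function name is hypothetical, and the computation is not part of Yensen's report.

```python
import math


def angle_from_projective_width(width_in: float, side_in: float = 4.0) -> float:
    """Angle from frontal-parallel (degrees) at which a square of the given
    side projects the given width, assuming parallel projection."""
    if not 0.0 <= width_in <= side_in:
        raise ValueError("projective width must lie between 0 and the side")
    return math.degrees(math.acos(width_in / side_in))


# The extreme mean widths cited from Yensen's Table IV.
for width in (3.99, 3.97):
    print(f"{width:.2f} in. -> {angle_from_projective_width(width):.1f} degrees")
```

On this assumption both widths correspond to rotations of only a few degrees from the frontal-parallel plane.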


Exposure Time and Intensity 


At least two studies (Leibowitz & 
Bourne, 1956; Leibowitz, Mitchell, & 
Angrist, 1954) have shown that ex- 
posure time may affect perceived 
shape. As exposure time was reduced 
from 1.0 seconds to .01 second, con- 
stancy was reduced for both a white 
disc and a half-dollar coin. An expo- 
sure of .01 second produced matches 
which corresponded with the pro- 
jective shape of the object. Similar 
results were obtained by Leibowitz 
and Bourne (1956) for variations in 
luminance. The reciprocal relation
between exposure time and intensity 
for very short exposure-durations 
(i.e., the Bunsen-Roscoe law) suggests
that some of the effect of exposure 
time might be due to the concomitant 
variations in intensity. However, 
Leibowitz and Bourne (1956, p. 280) 
presented evidence that exposure 
time has an effect on perceived shape 
in addition to its relationship to the 
total stimulus energy. In accounting 
for their findings the authors sur- 
mised that the effects of luminance 
“may be attributed to the impair- 
ment of acuity and intensity dis- 
crimination for ‘additional’ stimuli in 
the visual field” (Leibowitz & Bourne,
1956, p. 280). The effects of reduc-
tion in exposure time are similarly
explained. Leibowitz and Bourne's
conclusion is recommended by the 
fact that the shortest exposure dura- 
tion and lowest luminance resulted in 
a high degree of correspondence be- 
tween judged shape and projective 
shape. This suggests that the reduc- 
tion operation diminished the effec- 
tiveness of the cues for slant, and not 
the effectiveness of the projective 
shape. Had the latter been the case, 
then great variability would have 
been observed in O’s matches. In- 
direct empirical support of Leibowitz 
and Bourne's interpretation may be 
found in Clark's (1953) study of the 
influence of exposure time on the 
perception of slant. With only the 
retinal gradient of texture density as
a stimulus for slant, very brief expo-
sures of a surface slanted 37° from the
frontal-parallel plane resulted in per-
sistent underestimations of slant.
Perceived slant ranged from 8.1° to
14.9°.
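
The confounding of exposure time with stimulus energy can be made concrete. Under the Bunsen-Roscoe law the effect of a very brief flash depends on the product of intensity and duration, so shortening the exposure at a fixed luminance also lowers the total energy. The sketch below, with hypothetical function names and arbitrary units, computes the luminance that would hold that product constant. It does not reproduce the procedure of Leibowitz and Bourne.

```python
def stimulus_energy(luminance: float, exposure_s: float) -> float:
    """Product of luminance and exposure duration (arbitrary units)."""
    return luminance * exposure_s


def compensating_luminance(reference_luminance: float,
                           reference_exposure_s: float,
                           new_exposure_s: float) -> float:
    """Luminance that keeps luminance x duration constant at a new exposure."""
    if new_exposure_s <= 0.0:
        raise ValueError("exposure duration must be positive")
    return stimulus_energy(reference_luminance, reference_exposure_s) / new_exposure_s


# Illustrative values only: a tenfold cut in exposure time calls for a
# tenfold rise in luminance if total energy is to stay the same.
print(compensating_luminance(reference_luminance=1.0,
                             reference_exposure_s=0.1,
                             new_exposure_s=0.01))
```

An effect of exposure time that persists when energy is equated in this way cannot be attributed to intensity alone.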


KNOWLEDGE AND PRÄGNANZ AS
EXPLANATIONS OF SHAPE 
CONSTANCY 


It has already been shown that 
there is little direct evidence that
prior knowledge influences shape
constancy. Here it need be added 
only that an account in terms of 
knowledge or assumptions about the 
stimulus situation would, to para- 
phrase Koffka (1935, pp. 87-96), on 
the one hand, explain too much, and 
on the other hand, explain too little. 
While introduction of prior knowl- 
edge or assumptions might have 
helped to explain complete veridical- 
ity, these factors cannot help predict 
the percept which is not determined 
entirely either by the distal or proxi- 
mal stimulus. In addition, this ex- 
planation could not account for the 
functional relationships described in 
the previous section. 

Representing the opposite theoreti- 
cal pole, the question might be asked 
whether shape constancy can be
viewed as a product of the principle
of Prägnanz, i.e., a presumed tend-
ency to assimilate the slanted stand-
ard to a more stable frontal-parallel
representation. Perhaps a slanted
circle which produces an elliptical 
projective shape is assimilated to the 
more stable circular shape, thus re-
sulting in constancy. This interpreta-
tion conceivably could receive sup-
port from the observation that con-
stancy appears to vary for differently
shaped forms. However, a more 
parsimonious explanation of this 
finding might be made in terms of 
possible differences in the accuracy 
of apparent slant for different figures. 
For example, Clark, Smith, and 
Rabe (1956b) report findings which 
show that the slant of circles is more 
accurately perceived than the slant of 
rectangles. Stavrianos (1945, Experi-
ment II) also obtained differences in
the apparent slant-objective slant 
relationship for rectangles as com- 
pared with ellipses. In any event, 
sufficient reason for doubting the 
validity of the Prägnanz hypothesis 
is provided by a simple experiment 
performed by Thouless (1931a). The 
Os judged the shape of an elliptical
standard which was so proportioned 
and so slanted that the retinal projec- 
tion was that of a circle. Thouless 
(1931a) reported the following:


It will be found that not only is there no tend- 
ency for phenomenal regression to diminish 
as perspective shape approaches circularity, 
but even that under these conditions the 
index [of phenomenal regression] was greater 
than with any other perspective shape (p. 
347). 


However, if a preference for the more 
stable figure was a critical determi- 
nant, no constancy should have been 
obtained. The O should have per- 
ceived a circle. A very similar dem- 
onstration was reported by Moore 
(1938) who found that a prolate el- 
lipse slanted to produce a circular 
projective shape showed as much 
constancy as a slanted circle. 


INVARIANCE HYPOTHESIS 


The first explicit formulation of the 
invariance hypothesis was made by 
Koffka (1935) in order to explain 
apparent exceptions to two principles 
which he believed to be basic to an 
understanding of perception. (a) The 
first of these principles is that “two 
proximal stimuli if more than limi- 
nally different cannot produce exactly 
the same effect” (p. 228). This ap- 
pears to be contradicted by the fact 
that under certain conditions projec- 
tive shapes which are discernibly 
different yield similar perceived 
shapes. (b) The second of Koffka's 
principles is that proximal stimulus 
situations which are the same must 
produce the same perceptual effects. 
This appears to be contradicted by 
the fact that under certain conditions 
projective shapes which are identical 
yield different perceived shapes. 
These apparent paradoxes may be 
resolved by noting that a perception 
produced by a given proximal stimu-
lus pattern has at least two different
aspects, shape and orientation. Gib- 
son (1951) has put it this way: 


Perceiving a surface-form involves perceiving 
both the slant of the surface and the form of 
its edges; an impression of form is never ob- 
tained without some accompanying impres- 
sion of the angle at which the surface lies, 
either frontal or inclined. The problem of 
shape constancy, so-called, is better formu- 
lated as the problem of seeing shape-at-a- 
slant (p. 405). 


What is different in the percepts pro- 
duced by two different proximal 
stimulus patterns is the shape-slant 
combination and not necessarily the 
shape or slant alone. Thus if two 
different retinal patterns give rise to 
the perception of the same shape, it 
will be a shape perceived at two 
different degrees of slant. Conversely, 
what is invariant for a given retinal 
shape is not a given shape perception 
but a certain combination of apparent 
shape and apparent orientation. Thus 
if the same retinal pattern gives rise 
to perceptions of two different shapes, 
the accompanying impressions of 
slant will be such that the shape- 
slant relationship is invariant. For 
example, one percept produced by a 
given proximal stimulus situation 
might indicate an underestimation of 
slant and corresponding underesti- 
mation of the length of the fore- 
shortened axis of a slanted shape, 
while another percept produced by 
the same proximal stimulus situation 
would indicate an overestimation of 
slant and a corresponding overestima- 
tion of the foreshortened axis. These 
considerations have been summarized 
by Beck and Gibson (1955) in what 
may be designated as the "shape-
slant invariance hypothesis": “A
retinal projection of a given form 
determines a unique relation of ap- 
parent shape to apparent slant” 
(p. 126). 
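
One geometrical reading of the shape-slant invariance hypothesis can be sketched as follows. The sketch assumes parallel projection and a form slanted about a single axis, so that the retinal projection fixes the product of the apparent length of the foreshortened axis and the cosine of the apparent slant. The function name and the numerical values are hypothetical; this is an illustration, not Beck and Gibson's formulation.

```python
import math


def apparent_axis_ratio(projected_ratio: float, apparent_slant_deg: float) -> float:
    """Apparent ratio of the foreshortened axis to the other axis implied by a
    given projected ratio and a given apparent slant (parallel projection)."""
    if not 0.0 <= apparent_slant_deg < 90.0:
        raise ValueError("apparent slant must be in [0, 90) degrees")
    return projected_ratio / math.cos(math.radians(apparent_slant_deg))


# A circle slanted 60 degrees projects an ellipse whose axis ratio is cos(60).
projected = math.cos(math.radians(60.0))

for slant in (0.0, 30.0, 60.0):
    print(f"apparent slant {slant:4.1f} -> axis ratio "
          f"{apparent_axis_ratio(projected, slant):.2f}")
```

On this reading, underestimating the slant yields an apparent shape nearer the projective ellipse, while perceiving the slant correctly yields the objective circle.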

Experimental Evidence concerning
the Invariance Hypothesis 


The invariance hypothesis implies
that, for a retinal projection of a
given form, a reduction in the ac- 
curacy of perceived slant will be ac- 
companied by a corresponding reduc- 
tion in the accuracy of perceived 
shape. When the slant of an object is 
correctly perceived, phenomenal 
shape should correspond most closely 
to objective shape; if O errs in his 
perception of orientation, this should 
be accompanied by deviations of 
phenomenal shape from objective 
shape. For instance, in an extreme 
case, a slanted object might appear 
to O to be in the frontal-parallel 
plane. In this event there should 
obtain an extreme discrepancy be- 
tween apparent shape and objective 
shape; i.e., the apparent shape should 
approximate closely the retinal pro- 
jection of the object. 


Decreased Constancy Accompanying 
Reduction of Cues to Slant 


A number of studies have shown 
that shape constancy is diminished by 
conditions that reduce the availabil- 
ity or effectiveness of cues to orienta- 
tion. 

Monocular Observation. Brunswik 
ratios calculated by the writers from 
the data presented by Leibowitz et al. 
(1957, p. 659) showed that under 
conditions of tachistoscopic exposure, 
constancy was consistently lower for 
monocular than for binocular obser- 
vation. Similar results for unre- 
stricted viewing time were reported 
by Thouless (1931a, 1931b), who 
found that the phenomenal shape of a 
disc lying on a table top became more 
elliptical when O switched from bin-
ocular to monocular observation.

Elimination of Cues Provided by the 
Surroundings. In Thouless’ (1931a,
1931b) experiment a high degree of
constancy occurred even under the 
monocular condition because O could 
obtain slant cues from the relation- 
ship of the disc to the surface of the 
table. However, when the setting 
was darkened so that only the disc 
was visible, phenomenal shape equaled
retinal shape. Using luminous, sta- 
tionary outline shapes viewed mo- 
nocularly, Langdon (1951, 1955b) 
found that the mean constancy value 
shifted from .153 to less than .02 
when cues to slant emanating from 
the surroundings were eliminated by 
darkening the experimental room. 
Constancy values close to zero were 
obtained by Yensen (1957) for sta- 
tionary outline rectangles viewed 
through a reduction tunnel which 
restricted O’s view to the targets. 

Elimination or Reduction of Cues 
Provided by the Gradient of Texture. 
It has been demonstrated (e.g., 
Gibson, 1950a, 1950b) that the 
retinal gradient of texture is a stimu- 
lus-correlate for apparent slant. 
When Langdon (1953) eliminated 
the gradient of texture by employing 
as stimuli circular wire outlines in a 
fully lighted setting, he obtained a 
relatively low mean constancy value 
(.153). Yensen (1955, Experiment 
III) found that the apparent width of 
a standard slanted square viewed 
monocularly through a reduction 
tunnel was greater when the surface 
of the square had a determinate tex- 
ture (provided by randomly spaced, 
black dots on a white background) 
than when the surface was uniformly 
white and hence lacked any dis- 
cernible texture. 

Reduction of Exposure Time and 
Intensity. Leibowitz et al. (1954, 
1956) found that the phenomenal 
shape of a slanted disc progressively 
approached and finally equaled its 
projective shape as either exposure 
time or luminance was reduced. We
may presume that this outcome was 
the result of a progressive elimination 
of cues for slant. 
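
The Brunswik ratio referred to in this section can be computed as follows. The sketch uses the conventional definition, (R − S) / (A − S), where R is the obtained match, S the value of a match to the projective (retinal) shape, and A the value of a match to the objective shape. The function name and the sample values are hypothetical and are not drawn from any of the studies cited.

```python
def brunswik_ratio(response: float, projective: float, objective: float) -> float:
    """Brunswik ratio: 0 for a projective match, 1 for perfect constancy."""
    if objective == projective:
        raise ValueError("objective and projective values must differ")
    return (response - projective) / (objective - projective)


# Hypothetical example: a disc whose objective axis ratio is 1.0 projects a
# ratio of 0.5, and O selects a comparison ellipse with a ratio of 0.6.
print(round(brunswik_ratio(response=0.6, projective=0.5, objective=1.0), 2))
```

Values near zero, such as those reported for darkened or reduction conditions, indicate matches governed almost entirely by the projective shape.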


Slant and Shape Judgments Compared 


Because none of the above studies 
obtained judgments of slant, their 
findings, as they bear on the invari- 
ance hypothesis, are inconclusive. A 
number of experiments in which O
made both shape and slant judg- 
ments have revealed that conditions 
which reduce the accuracy of one of 
these kinds of judgment, e.g., slant, 
do not necessarily produce a corre- 
sponding reduction in the accuracy of 
the other kind of judgment, e.g., 
shape. 

For purposes of exposition we shall 
divide the experiments in which both 
shape and slant judgments were ob- 
tained into (a) those in which one of 
the two kinds of judgment is unreli- 
able and (b) those in which both kinds 
of judgment are reliable. 

Experiments in Which One of the 
Compared Judgments is Unreliable. 
These experiments in turn may be 
divided into those which contradict 
the invariance hypothesis and those 
which support it. 

1. Experiments which Contradict 
the Invariance Hypothesis—One of 
the earliest investigations (Eissler, 
1933) of the shape-slant relation- 
ship yielded paradoxical results. The 
standards were rectangles and ellipses 
rotated around their vertical axes to 
deviations of 30° and 60° from
frontal-parallel. The comparisons
were a series of frontal-parallel shapes
which were presented by the method 
of constant stimuli. After making a 
series of shape judgments for a given 
standard, O was required to make a 
verbal judgment of its apparent slant.
Shape and slant judgments were 
made under full-cue and reduction 
conditions. In accord with the in- 
variance hypothesis, shape constancy 
decreased as cues to slant were elimi-
nated. A mean Brunswik ratio of
.136 for binocular observation de- 
creased to .473 for monocular view- 
ing. A similar reduction in con- 
stancy occurred when perspective 
cues and shadows were eliminated by 
having O observe the stimuli through 
half-closed eyes or tinted glasses. 
However, slant judgments did not 
match shape judgments as required 
by the invariance hypothesis. In 
some cases a slanted object was seen 
as frontal-parallel or as only “slightly 
turned” and yet with good constancy, 
and in other cases fairly accurate 
estimations of orientation were ac- 
companied by low constancy. Simi- 
lar results were obtained by Klimp- 
finger (1933a, 1933b). 

However, as has been noted by 
Stavrianos (1945) and Koffka (1935), 
any conclusions drawn from Eissler's 
results would be somewhat tenuous 
because (a) the evidence with regard 
to apparent slant rests on verbal re- 
ports made after each series of judg- 
ments rather than on quantifiable, 
contemporaneous judgments; (b) 
cases of accurate slant judgments 
without constancy were rare; and (c) 
more than a third of the cases of 
constancy without perception of non- 
normal orientation belonged to a 
one-eyed subject, whose results dif- 
fered in many ways from those of 
normal subjects. 

Somewhat more reliable findings 
have emerged from several recent 
studies. Haan and Bartley (1954) 
had O make binocular observations 
of three luminous outlines: a circle 
and two ellipses. These objects were 
presented one at a time in a totally 
dark field at a distance of 17½ feet
from O. Each outline was oriented at 
four different degrees of slant ranging 
from 0° to 67.5° away from vertical.
Using a slant-board, O was able to 
reproduce the planes in which the 
standards lay with a fair degree of 
accuracy. Nelson and Bartley (1956)
reported that the same O, in the same 
experimental situation, produced 
drawings of the standards which were 
very similar in shape to their frontal- 
parallel projections. However, we 
cannot be sure that these drawings 
indicate inaccurate shape perception. 
It is possible that O saw each stand- 
ard as a shape-at-a-slant. When 
asked to draw this shape, O may 
have attempted to indicate per- 
ceived slant by foreshortening, i.e., 
by using the artist's device of repre- 
senting slant around the horizontal 
axis by drawing an ellipse with a 
shortened vertical axis. This possi- 
bility is given substance by the re- 
sults of an experiment by Clark, 
Rabe, and Smith (1956a).

Clark et al. (1956a) required O to 
adjust a pivoted rod to match the 
slant of each of a series of rectangles 
which were inclined 40° from frontal-
parallel. Since O was limited to mo-
nocular vision with head motionless,
it is not surprising that the mean 
perceived slants were much smaller 
than the objective slant. What is 
surprising is that verbal judgments 
indicated that the stimuli appeared 
to be rectangles rather than trape- 
zoids or any intermediate shape. 
Equally at odds with the invariance 
hypothesis were O's drawings of the 
standards as trapezoids such as would 
have been projected by rectangles 
slanted a few degrees more than 40°. 
The drawings can be reconciled with 
the verbal reports on the assumption 
that O was trying to indicate slant 
by foreshortening. In a second ex- 
periment employing circular stand- 
ards and permitting binocular as well 
as monocular viewing, Clark et al. 
(1956b) corroborated the paradoxical 
findings of their earlier study: al- 
though mean perceived slants were 
less than objective slant, all Os re- 
ported that they saw circles rather 
than ellipses. 


2. Experiments which Support the
Invariance Hypothesis. Qualified
support for the invariance hypothesis 
is provided by Beck and Gibson 
(1955), who found that a nonveridical 
perception of slant was accompanied 
by a matching modification in the 
perception of shape. In order to 
eliminate all cues to slant except 
those provided by gradients of back- 
ground texture, the stimuli were pre- 
sented at a distance of 7 feet and O 
was limited to monocular vision with 
the head motionless. The standard 
was a triangle of indiscernible texture 
slanted outward from a roughly tex- 
tured vertical background at an 
angle of 45°. Verbal reports indicated
that all Os saw the triangle as
being in the same plane as its back- 
ground. Shape judgments were ob- 
tained by having O match the stand- 
ard triangle with one of two com- 
parison triangles mounted flat on the 
same background. One comparison 
had the same objective shape as the 
standard, while the other had the 
shape the standard would have if pro- 
jected on the background. As re- 
quired by the invariance hypothesis, 
all Os matched the standard with the 
comparison whose shape was its 
frontal-parallel projection. When 
stimulation for the slant of the stand- 
ard was introduced by permitting 
binocular vision, there was an ex- 
pected shift to the comparison which 
was objectively equal. Nevertheless, 
the projectively equal comparison 
continued to be selected in 23% of 
the cases. In several of the latter 
instances Os were asked to repro- 
duce the slant of the standard by 
adjusting a vertical plate. Since 
they were able to do so with some 
accuracy, it is evident that their 
nonveridical shape matches consti- 
tute an exception to the invariance 
hypothesis. Beck and Gibson's find- 
ings failed to support a precise 
statement of the invariance hypothesis.
The results indicated only that
there was a tendency to an in- 
variant shape-slant relationship. 
Epstein, Bontrager, and Park 
(1962) extended the Beck and Gibson 
experiment by presenting the background
at 3° of slant and employing
a comparison stimulus whose shape
could be continuously varied. A 
more adequate measure of apparent 
slant was obtained by having each O 
rotate a circular disc to the same 
slant as the triangular target. For 
both monocular and binocular observation,
the results showed less adherence
to the invariance hypothesis
than did the results of Beck and 
Gibson. This may be attributed to 
the fact that Beck and Gibson forced 
O to choose between one of two ex- 
treme alternatives, i.e., objective or 
projective. Faced with such a 
choice, O may have selected the comparison
object which was most like
the apparent shape although neither 
comparison stimulus was judged to 
be the same as the standard. In the 
experiment by Epstein et al. the con- 
tinuously variable comparison en- 
abled O to make more sensitive dis- 
criminations. 
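
The prediction under test in these experiments can be sketched numerically. This is one possible formalization, not the authors' own formula: it assumes cosine foreshortening and uses a height-to-base ratio as the shape measure, so that for a fixed retinal projection the apparent ratio is the projected ratio divided by the cosine of the apparent slant. All names below are hypothetical.

```python
import math


def predicted_apparent_ratio(objective_ratio: float,
                             actual_slant_deg: float,
                             apparent_slant_deg: float) -> float:
    """Apparent height/base ratio predicted by an invariant
    shape-slant relation (assumed cosine foreshortening)."""
    # Shape delivered to the eye by the slanted standard.
    projected = objective_ratio * math.cos(math.radians(actual_slant_deg))
    # Shape that, at the apparent slant, would give that same projection.
    return projected / math.cos(math.radians(apparent_slant_deg))


# Standard slanted 45 deg, as in the Beck and Gibson arrangement.
seen_flat = predicted_apparent_ratio(1.0, 45.0, 0.0)     # slant not registered
seen_slanted = predicted_apparent_ratio(1.0, 45.0, 45.0)  # slant fully registered
print(f"apparent slant  0 deg -> ratio {seen_flat:.3f} (projective match)")
print(f"apparent slant 45 deg -> ratio {seen_slanted:.3f} (objective match)")
```

Under these assumptions a forced choice between the two extreme comparisons cannot register intermediate apparent ratios, whereas a continuously variable comparison can.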

Additional evidence of a loose link- 
age between apparent shape and ap- 
parent slant was reported by Yensen 
(1955). In one study Yensen (1955, 
Experiment II) found that for the 
same actual slant the apparent width 
of a standard square was significantly 
greater when the standard appeared 
at a greater angle of slant than when 
it appeared at a lesser angle of slant 
(as a result of restricting observation 
to monocular viewing under low illumination).
However, the confirmation
of the invariance hypothesis
must be qualified by the fact that
some Os who reported the standard
at 0° showed some degree of constancy,
nevertheless.

With the exception of the experiment
by Epstein et al. (1962), all the
experiments summarized above suffer
from the unreliability of either the 
shape or the slant judgment; one or 
the other judgment was obtained from 
a drawing or a verbal report or by 
means of a forced-choice technique. 

Experiments in Which Reliable 
Judgments of Shape and Slant Were 
Obtained. Reliable judgments of both 
shape and slant under the same ex- 
perimental conditions were obtained 
originally by Stavrianos (1945). In 
Stavrianos’ first experiment, two 
standard rectangles were presented 
at four angles of inclination under 
three reduction conditions: normal 
binocular vision, binocular vision 
with reduction tubes, and monocular 
vision with reduction tubes. The 
O's task was first to adjust the slant 
of a comparison rectangle (of dif- 
ferent dimensions than the standard) 
until its slant appeared equal to that 
of the standard. Then, under “objective”
instructions, O adjusted the
shape of a frontal-parallel trapezoid 
of fixed base (different from that of 
the standard) until it appeared to be 
the same shape as the standard. The 
slant and shape variable stimuli were 
always viewed with full binocular 
vision. The results failed to support 
the invariance hypothesis. 

1. For separate pairs of shape and 
slant judgments there was not a close 
relation between the deviations from 
the mean of slant judgments and the 
deviations from the mean of shape 
judgments. 

2. A comparison of mean constant 
errors for shape and slant judgments 
failed to reveal the expected con- 
comitant variation: (a) As depth cues 
were eliminated, there was an in- 
creasing underestimation of the slant 
of the standard, but the accuracy of 
shape judgments did not decrease. 
(b) Although significantly larger un- 
derestimations of slant occurred at 
intermediate angles of inclination of 
the standard, there was no general
ard to occur at those angles.

3. Variability in slant judgments 
was greater under the monocular con- 
ditions as compared to the normal 
condition, and this difference increased
as angle of inclination from
frontal-parallel increased. However,
no consistent trends with regard to 
variability were evident for shape. 
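
The first of these analyses, relating deviations of paired slant and shape judgments from their respective means, amounts to a product-moment correlation. The sketch below shows the computation only; the paired judgments are hypothetical numbers invented for illustration and are not Stavrianos' data.

```python
import math


def pearson_r(xs, ys):
    """Product-moment correlation of paired judgments, computed from
    each judgment's deviation from its own mean."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dev_x = [x - mean_x for x in xs]
    dev_y = [y - mean_y for y in ys]
    numerator = sum(dx * dy for dx, dy in zip(dev_x, dev_y))
    denominator = math.sqrt(sum(dx * dx for dx in dev_x) *
                            sum(dy * dy for dy in dev_y))
    return numerator / denominator


# Hypothetical paired settings from one O at one inclination.
slant_settings_deg = [31.0, 28.5, 34.0, 30.0, 27.0, 33.5]
shape_settings_ratio = [0.84, 0.88, 0.83, 0.90, 0.85, 0.86]
print(f"r = {pearson_r(slant_settings_deg, shape_settings_ratio):.2f}")
```

A close shape-slant linkage would require this coefficient to be large in absolute value for each O and condition.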

In Preliminary Experiments B and 
C, Stavrianos had found a constant 
error in shape and slant judgments 
which could be attributed to the in- 
equality of the absolute size of the 
standard and the variable. When the 
data of Experiment I were corrected 
for these constant errors, the expected 
relationship between apparent shape 
and apparent slant was found for 
some Os for the monocular condition. 
In order to provide additional in- 
formation about the shape-slant re- 
lationship under monocular condi- 
tions, a larger number of monocular 
judgments was obtained in Experi- 
ment II which differed from Experi- 
ment I chiefly in that ellipses as well 
as rectangles served as standards. 
When applied to the results of Ex- 
periment II only one of the methods 
of data analysis described above 
yielded support for the invariance 
hypothesis: for three of the five Os, 
the increased underestimation of 
slant which occurred at intermediate 
angles of inclination was accompanied 
by decreased constancy of shape. 

Stavrianos explained her failure to 
obtain a precise relation between ap- 
parent shape and apparent slant on 
the grounds that slant judgments 
made under the conditions of her 
first two experiments did not ac- 
curately represent the slant regis- 
tered by the observer when he made 
his shape judgments. “The perception
of both tilt and shape when they
are merely registered as background 
or as incidental parts of the total 


tangles taller than wide to 

shorter than wide. These stimulus 
forms were mounted together on a 
rectangular background, the slant of 
which O was required to match by 
adjusting the slant of a comparison 
rectangle. The results failed to support
the hypothesis of a precise
shape-slant relationship, although er- 
rors in slant adjustments were ac- 
companied by approximately match- 
ing errors in shape judgments for 
some Os. 

It is possible that even in Stav- 
rianos’ third experiment, the explicit 
judgment of shape was not the same 
as the implicit registration of shape 
which occurred when O judged the 
slant of the standard. It may be that
the invariance hypothesis does not
apply to experimental situations in
which shape and slant are judged
separately.

Attempts to Deal with the Problem of
Explicit Judgment versus Implicit 
Registration 


Beck and Gibson (1955, Experiment
I) attempted to test the invariance
hypothesis without requiring
O to make separate judgments of
shape and slant. The O made his 
match simply by selecting a compari- 
son stimulus from a series of dif- 
ferently proportioned targets which 
were also slanted differently. Thus 
the judgments of slant and shape were 
“implicit in the same act of matching
the standard object” (Beck &
Gibson, 1955, p. 128). As standards, 
Beck and Gibson employed texture- 
less, luminous shapes viewed monoc- 
ularly, with motionless head, so that 
stimuli for slant were effectively 
eliminated. Comparison shapes were 
presented together on a single panel 
viewed under full-cue conditions. 
Each set of comparison objects in- 
cluded a number of shape-and-slant 
combinations which were projec- 
tively equal to the standard and a 
number of combinations which were 
not projectively equal. Beck and 
Gibson assumed that the invariance 
hypothesis would be supported by the 
choice of the former shape-slant 
combinations and contradicted by 
the choice of the latter. Between 
82% and 92% of the matches were 
projectively equal to the standards. 
However, these results are somewhat 
equivocal since the choice of com- 
parison shapes which are projec- 
tively equal to the standard gives no 
evidence that the observer was in any 
way registering or taking into ac- 
count the slant of the comparison. 
He might have been matching pro- 
jective shapes. In addition, choices 
of comparisons which were not pro- 
jectively equal to the standard may 
be in accord with the invariance hy- 
pothesis. For example, O may have 
overestimated the slant of the comparison
shapes, and hence he may
have chosen a shape with a shorter 
vertical axis than that required to 
give the same retinal projection as 
the standard. Furthermore, choices 
of comparisons which are projec- 
tively equal to the standard may in- 
volve a contradiction to the in- 
variance hypothesis. For example, 
O may have overestimated the slants 
and yet underestimated the vertical 
axes of the slanted shapes. 
Essentially the same kind of objec- 
tions apply to the attempts by 
Langdon (1953, 1955b) to avoid the 
problem of the relation between im- 
plicit registration and explicit judg- 
ment of slant. According to Lang- 
don, the assumption of an invariant 
relation between shape and slant 
carries the implication that con- 
stancy is either constant throughout 
the arc of slant or varies as a simple 
function of the angle of slant. Thus 
it should be possible to test the invariance
hypothesis without concerning
oneself with slant judgments
at all. One need only obtain shape
judgments at a number of angles of 
orientation. When Langdon ob- 
tained such judgments for an oscillat- 
ing circle as a stimulus, he found ir- 
regular variations of constancy over 
the arc of slant. Repeated findings of 
such irregularities in the relationship 
between constancy and angle of slant 
led Langdon to the conclusion that 
the development of any simple in- 
variant shape-slant formula is im- 
probable. However, Langdon has 
overlooked the possibility that the 
relationship between apparent shape 
and apparent slant may be simple 
while the relationship between actual 
slant and apparent slant may be complex.
If Langdon were to measure
apparent slant, he might find that it
varies concomitantly with the irregu- 
lar variations in apparent shape. We 
must conclude that until an adequate 
technique is developed for obtaining
simultaneous slant and shape judgments,
studies obtaining explicit
judgments of shape and slant remain 
the most acceptable source of evi- 
dence regarding the invariance hy- 
pothesis. 


Concluding Remarks about the 
Invariance Hypothesis 


The review which has been completed
above revealed that the invariance
hypothesis rests on a precarious
evidential base. Attempts to
provide experimental confirmation of 
a precise relationship between ap- 
parent slant and shape have been 
unsuccessful. In addition, the force 
of the evidence which indicates a 
less rigid, general shape-slant rela- 
tionship is mitigated by the experi- 
mental results which contraindicate 
the assumption that this relationship 
obtains. 

Another consideration relates to
the sufficiency of the invariance 
hypothesis as a basis for predicting 
perceived shape. It would seem that 
the adequacy of the hypothesis de- 
pends on the possibility that the 
various factors whose influence on 
shape constancy has been demon- 
strated may be shown to affect per- 
ceived slant. Only if these factors 
can be demonstrated to exert their 
influence on shape perception by de- 
termining apparent slant can they 
then be incorporated readily into the 
invariance hypothesis. If their in- 
fluence is not channeled in this way, 
then they must be given independent 
status outside the shape-slant hy- 
pothesis. No systematic work on this 
question has been performed. 

The effects of angle of orientation 
on shape constancy may serve to 
clarify this consideration. As has 
been previously noted, any variation 
in constancy over the arc of slant 
appears to pose a serious problem for 
Koffka's theory since the invariant 
relation implies that the amount of
constancy remain the same throughout
the arc of slant. Koffka (1935,
pp. 233-234), commenting on Eissler's
(1933) results, attempted to explain
these variations by referring to
the actions of “internal” and “external
forces.” The former is a force
set up by the “nonnormal orienta- 
tion” of the stimulus; and it tends to 
reduce the apparent slant, thus en- 
couraging constancy. This internal 
force does not increase in proportion 
to the angle of orientation. The dis- 
torted retinal image produces an ex- 
ternal force which is assumed to 
increase more rapidly than the angle 
of orientation. Since the external 
forces become stronger than the in- 
ternal forces as the angle of orienta- 
tion increases, constancy should de- 
cline with the increasing angle. 
Koffka's interpretation has been criti- 
cized by Langdon (1953) on several 
points. Here we may note also that
regardless of the applicability of 
Koffka's reasoning to Eissler's find- 
ings, there would still remain the 
task of explaining the results re- 
ported by other investigators regard- 
ing this relationship. 

However, an alternate explanation 
of the influence of orientation angle 
might be introduced which would be 
consistent with the invariance hy- 
pothesis. Perhaps the changes in con- 
stancy can be ascribed to shifts in 
the accuracy of slant perception as 
the angle of orientation increases. If 
the angle of orientation is progres- 
sively underestimated with incre- 
ments in orientation, then the ob- 
tained decreases (e.g., Eissler, 1933) 
in constancy would be expected. Con- 
versely, a consistent trend toward 
increasing accuracy of perception of 
slant with increasing objective slant 
would account for increasing constancy
with increasing slant. Of
course a linear relationship between
the degree of accuracy and objective
slant could not account either for the
results showing changes in constancy
over part of the range of slant only 
or the findings of irregular variations 
of constancy. 
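
The reasoning in this paragraph can be followed through with a small numerical sketch. It assumes, for illustration only, a slanted circle, cosine foreshortening, an exact invariant shape-slant relation, and a Brunswik-type ratio (a - p)/(r - p) as the index of constancy. The apparent slants are hypothetical values chosen so that underestimation grows with the objective angle.

```python
import math


def apparent_ratio(actual_slant_deg: float, apparent_slant_deg: float) -> float:
    """Apparent axis ratio of a slanted circle if shape follows
    apparent slant exactly (assumed cosine foreshortening)."""
    projected = math.cos(math.radians(actual_slant_deg))
    return projected / math.cos(math.radians(apparent_slant_deg))


def constancy(actual_slant_deg: float, apparent_slant_deg: float) -> float:
    """Brunswik-type index: 0 = projective match, 1 = objective match.
    Defined only for actual slants greater than zero."""
    p = math.cos(math.radians(actual_slant_deg))  # projective ratio
    r = 1.0                                       # objective ratio of a circle
    a = apparent_ratio(actual_slant_deg, apparent_slant_deg)
    return (a - p) / (r - p)


# Hypothetical (objective slant, apparent slant) pairs with
# progressively greater underestimation.
for actual, apparent in ((20.0, 18.0), (40.0, 28.0), (60.0, 30.0)):
    print(f"objective {actual:4.1f} deg, apparent {apparent:4.1f} deg "
          f"-> constancy {constancy(actual, apparent):.2f}")
```

With these invented values the index falls as the objective angle rises, although the shape-slant relation itself is held exact throughout; only the accuracy of slant perception changes.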

There is some evidence (Smith, 
1956; Stavrianos, 1945) which can be 
brought to bear on this interpreta- 
tion. Smith's experiment can serve 
as an illustration. Smith presented 
rectangular and circular stimuli at 
one of six angles of slant ranging in steps of
10° from 0° to 50° with only the
gradients of outline convergence and 
distortion as stimuli for slant. At all 
angles the mean perceived slant was 
much less than the objective slants. 
However, “for each condition, the
percentage of error in perception, i.e.,
the difference between the actual and
perceived slant, decreased regularly
as the angle of slant increased for
angles greater than zero” (Smith,
1956, p. 214). Other experiments 
which examined the relationship be- 
tween perceived slant and objective 
slant over the full range of slant and 
under various conditions of observa- 
tion might reveal that different ap- 
parent-objective slant relationships 
obtain under different conditions. 
Data of this sort would have impli- 
cations for the invariance hypoth- 
esis, and might clarify the presently 
contradictory results regarding the 
effects of angle of slant on degree of 
constancy. 


EVALUATION OF THE EXPERIMENTAL 
METHODOLOGY 


An analysis of the experimental 
procedures which have been used in 
studying shape constancy and the 
shape-slant relationship suggests the 
need for considerable improvement. 
The following points deserve to be 
noted: 

1. Many of the investigations failed 
to obtain information concerning the
perceived slant of the targets. Nor 
were conditions created which would
allow the experimenter to make a reliable
assumption about the phenomenal
orientation of the stimuli.

2. Crude response measures were 
often employed. Thus Thouless 
(1931) and Nelson and Bartley 
(1956) asked O to draw the apparent 
shape of the target. In addition to 
the obvious ambiguities which are 
inherent in the drawing response, 
e.g., differences between Os in ability 
to draw what is seen, there is another 
flaw in the procedure which is 
peculiar to the shape constancy situa- 
tion. Suppose O is shown a circle 
slanted from the frontal-parallel plane 
on its horizontal axis and he is in- 
structed to draw what he sees. If 
normal conditions of observation pre- 
vail, and the angle of slant is not too 
great, O will probably see a circle 
which is slanted. How is O to repre- 
sent this percept in his drawing? 
Many Os will attempt a crude per- 
spective representation and draw an 
ellipse with an elongated horizontal 
axis. If the experimenter accepts 
this product without further in- 
quiry, he will conclude erroneously 
that constancy is incomplete. This 
shortcoming of the drawing as an 
indicator of perceived shape may be 
stated more generally: an unam- 
biguous representation of perceived 
shape-at-a-slant is difficult to obtain. 

Another illustration of an inade- 
quate response measure is Beck and 
Gibson’s (1955, Experiments II and 
III) forced-choice technique. The O 
must choose either the comparison 
which meets the projective require- 
ments or the one which satisfies the 
objective requirements. In this 
case, the results may be only a mis- 
leading artifact of the technique. 
The O may choose the comparison 
which is most like the apparent shape 
although neither comparison stimu- 
lus is judged to be the same shape as 
the standard. Lacking the opportunity
of making sensitive discriminations
in the response system, O
gives results which may be inter- 
preted erroneously as the absence of 
differentiation in the perceptual sys- 
tem. The same problem may arise 
when complete reliance is placed on 
O's verbal designation. In this case 
minor differences in apparent shape 
may be assimilated into undifferen- 
tiated broader language categories, 
e.g., the category of circles which may 
include slightly eccentric ellipses. 

3. On occasion the range of com- 
parison stimuli was not sufficiently 
broad. No allowance was made for 
exaggeration of objective shape or 
overstatement of projective shape. 
Also, as Gottheil and Bitterman 
(1951) point out, often no provision 
was made for perfect constancy. 
Thus, if the standard is a slanted 
circle and the comparison series is 
comprised of a set of ellipses, it may 
be possible to make a perfect pro- 
jective match; but it is impossible to 
make a perfect objective match. 
Along these lines is the case in which 
O's efforts to match the comparison 
to the standard along one dimension, 
e.g., width, requires that the com- 
parison assume a phenomenal shape 
different from the standard (Lichte, 
1952; Yensen, 1957). Under these 
conditions O is confronted needlessly 
with a conflict between the tendency 
toward a match representing phe- 
nomenal equality of shape and the 
performance required by the experi- 
menter. 

4. Several investigators failed to 
specify the instructions which were 
given to O. As a result, it is not al- 
ways possible to compare the findings 
of different experiments with con- 
fidence that the Os were actually 
performing the same task. In addi- 
tion, the instructions supplied by 
some experimenters were vague and 
did not make clear to O which of the
several possible matches was desired.
In these instances, it is possible
that different Os were matching
for different aspects of shape, and
also that the same O was not con- 
sistent on the several trials or under 
the several conditions for which he 
was tested. The results of Joynson's 
(1958a, 1958b) studies of perceived 
size and Joynson and Newson's 
(1962) study of perceived shape show 
that these concerns are justified. 
Joynson and Newson gave their Os 
instructions which were intentionally 
vague. The Os were told to select 
from a series of comparison tri- 
angles "the one that looks most like 
the one you are going to see" (Joyn- 
son & Newson, 1962, p. 3). The Os 
responded in various ways. Type R 
Os (62%) made no distinction be- 
tween objective and nonobjective 
(phenomenal or projective) equality 
and tried to match for "the real 
shape." These Os produced matches 
which were close to objective equal- 
ity regardless of the inclination. 
Type RN Os (38%) were spon- 
taneously aware of the different pos- 
sible interpretations of the instruc- 
tions, i.e., objective or nonobjective. 
The frequency with which these Os 
made nonobjective judgments (phe- 
nomenal or analytic) increased as the 
angle of inclination increased. When 
a nonobjective interpretation was assigned
to the instructions O disregarded
the slant and typically produced
"compromise" judgments.
However, Type RN Os who matched 
for objective equality showed an 
even higher degree of constancy than 
the Type R Os. This occurred be- 
cause the Type RN Os were more 
deliberate in taking the inclination 
into account. 

5. Experiments differ from each 
other along dimensions whose effect 
on apparent shape has in most cases 
not been subjected to independent 
systematic investigation. These un- 
controlled and unassessed differences 
create difficulties in interexperiment
comparisons. Some of the procedural
differences were these:

a. In some studies (e.g., Angrist, 1954;
Eissler, 1933; Klimpfinger, 1933a, 1933b;
Leibowitz & Bourne, 1956; Leibowitz et al.,
1957; Sheehan, 1938; Thouless, 1931) the
standard shape was slanted and was to be
matched by a frontal-parallel comparison
shape. Other experimenters (e.g., Langdon,
1951, 1953; Lichte, 1952; Yensen, 1957)
presented the standard in the frontal-parallel
plane to be matched by a comparison
stimulus at a particular slant.

b. Different means of producing the slant
of the slanted stimulus have been employed.
Most important is the fact that in some
cases the stimulus object was rotated about
its horizontal axis (e.g., Moore, 1938;
Stavrianos, 1945) and in other cases about
its vertical axis (e.g., Langdon, 1951, 1953;
Lichte, 1952). Muto's (1954) and Nellis’
(1958) findings showed that this apparently
trivial procedural aspect may be important.

c. The distance of the stimulus objects
from O ranged from 75 cm. to 6 m. Some
evidence that the distance of the target may
be important was reported by Langdon
(1953) and Gruber and Clark (1956). The
latter investigators found that as distance
increased, perceived slant decreased.

d. The size of the similarly shaped test
objects employed by different investigators
has varied considerably. The possible confoundings
which may result from inattention
to this variable are suggested by
Stavrianos' (1945) finding that the relative
amount of the vertical-horizontal illusion
decreases as the horizontal extent of the
stimulus form increases. In addition, Stavrianos
found that the mean estimated tilt
for a large rectangle (180 × 250 mm.) was
greater than that of a small rectangle
(80 × 150 mm.). Both of these by-products
of the size of the stimulus will influence
apparent shape.

e. In several cases (e.g., Beck & Gibson,
1955; Epstein et al., 1962; Stavrianos, 1945)
O was able to view both the standard and the
comparison simultaneously. In other experiments
the situation was arranged to
make simultaneous observation impossible.
This was accomplished by separating the
stimuli by a sufficient enclosed angle (e.g.,
Langdon, 1951) or by presenting the stimuli
successively (e.g., Leibowitz et al., 1954).
Fragmentary evidence reported by Moore
(1938) and by Joynson and Newson (1962)
suggests that these conditions may lead to
differences in the amount of constancy.


6. Different quantitative measures
of constancy were used. Therefore,
it is misleading to compare the results
of different investigators simply
in terms of “amount of constancy.”
An advance toward a systematic 
analysis of shape constancy would be 
achieved if a single measure were 
agreed upon. Of those which have 
been employed, the Thouless-Bruns- 
wik formulae seem to have little to 
recommend them. In addition to the 
objections which Koffka (1935, pp. 
226-227) and Brunswik (1940) have 
raised, there are the following re- 
strictions on the usefulness of these 
formulae. 

The application of the formula is 
restricted to the case in which the 
comparison stimulus is in the frontal- 
parallel plane. It is only in this case 
that the value for the apparent shape, 
i.e., a, can be assigned safely. In the 
case of a comparison which is 
slanted also, the assignment becomes 
problematical. The value may be 
designated by the objective dimensions
of the match, but it also may
be designated by the projective dimensions
of the match. An uncritical
decision in favor of the former
implicitly assumes the validity of the
shape-slant hypothesis and shape 
constancy. If the projective dimen- 
sions are selected, then the perplex- 
ing circumstance is created in which 
a perfect objective match will yield 
less than 100% constancy. 

An implicit assumption underlying 
the use of the formula is that a shape 
viewed normally in the frontal- 
parallel plane will be perceived 
veridically. However, in this special 
case where r=p the use of the 
Thouless-Brunswik ratio to express 
the outcome can be misleading. 
Thus, a perfect match would yield a 
ratio equal to zero, and only matches 
which exceeded the objective dimen- 
sions of the standard would yield 
positive values falling between 0 and 
1.0. 
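
For orientation, the two ratios under discussion can be sketched as follows. The forms used here are the commonly cited ones, (a - p)/(r - p) for the Brunswik ratio and its logarithmic counterpart for the Thouless ratio, with a the matched (apparent) value, p the projective value, and r the objective value. That these are the exact forms intended in this review is an assumption, since the formulae themselves are stated outside this section. The numerical values are hypothetical.

```python
import math


def brunswik_ratio(a: float, p: float, r: float) -> float:
    """(a - p) / (r - p): 0 at a projective match, 1 at an objective match."""
    if r == p:
        # In this form the denominator vanishes when the standard is
        # frontal-parallel, so the sketch declines to assign a value.
        raise ValueError("ratio is undefined when r equals p")
    return (a - p) / (r - p)


def thouless_ratio(a: float, p: float, r: float) -> float:
    """Logarithmic version: (log a - log p) / (log r - log p)."""
    if r == p:
        raise ValueError("ratio is undefined when r equals p")
    return (math.log(a) - math.log(p)) / (math.log(r) - math.log(p))


# Hypothetical axis ratios for a slanted circle.
a, p, r = 0.85, 0.70, 1.00
print(f"Brunswik ratio: {brunswik_ratio(a, p, r):.3f}")
print(f"Thouless ratio: {thouless_ratio(a, p, r):.3f}")
```

Both functions presuppose that a is measured on a frontal-parallel comparison, which is the first restriction noted above.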

One alternative would be to express
constancy as the amount of
compensation (Nellis, 1958). The 
degree of compensation is the ratio of 
the dimensions of the shape chosen 
to match the standard to the projec- 
tive dimensions of the standard. 
This may be written as a/p, or, if 
one prefers, log (a/p). This formula 
is not subject to Koffka's criticisms 
concerning restriction of range and 
constancy values beyond unity. Nor 
is it inapplicable in the special case
where r = p, since a logarithm of 0 indicates
that the organism has done
no work, i.e., has not compensated
for any difference between a and p.
However, it does suffer from the first 
of the two restrictions noted above. 
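
The compensation measure described in this paragraph is simple enough to state directly. The sketch below computes a/p and log (a/p) as given in the text; the function names and the sample values are hypothetical.

```python
import math


def compensation(a: float, p: float) -> float:
    """Ratio of the matched dimension a to the projective dimension p."""
    return a / p


def log_compensation(a: float, p: float) -> float:
    """Logarithmic form; zero means the match equals the projection."""
    return math.log(a / p)


# Hypothetical cases: no compensation, partial compensation, and a
# match that overshoots the objective value of 1.00.
for a, p in ((0.70, 0.70), (0.85, 0.70), (1.10, 0.70)):
    print(f"a={a:.2f}, p={p:.2f}: a/p={compensation(a, p):.3f}, "
          f"log(a/p)={log_compensation(a, p):.3f}")
```

Because r does not enter the expression, the measure remains defined when the standard is frontal-parallel and places no upper bound on the values obtained.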


CONCLUSION 


The perceptual constancies have
played an important role in the development
of perceptual theory. They
have served the purposes of diverse 
and opposed theoretical formula- 
tions. Despite this prominence there 
is surprisingly little in the way of 
well established functional relation- 
ships in this area. With the possible 
exception of size constancy, theoretical
speculation has far outdistanced
(or disregarded) the experimental
evidence. An illustration of
this state of affairs is to be observed 
with regard to the constancy of 
shape. The only remedy for this 
condition is more experimentation 
with the aim of identifying the de- 
terminants of shape constancy and 
describing their interaction. Hope- 
fully, a theory formulated on this 
basis will be more adequate to the 
tasks which are required of it.

In basic research with factor analysis, irregularities of sampling experimental
variables make computer rotations and oblique solutions of
dubious value. Both are functions of the selection of variables. Criteria 
of simple structure are sufficiently loose to permit different rotational 
solutions to satisfy them. Illustrative examples are given, comparing 
solutions with quartimax, varimax, and orthogonal graphic rotations. 
Only the latter gave satisfactory psychological meaningfulness. 


Since the elaboration of multiple- 
factor theory and the development of 
extraction methods more than 25 
years ago, it has been known that the 
computational solutions permit an in- 
finite number of possible positions for 
the reference axes. The location of 
each axis for an extracted factor is 
entirely a function of the variables 
intercorrelated and of the method of 
extraction. Psychologists who use 
factor analysis as an analytical tool 
are likely to desire that in the final 
solution each axis shall be in such a 
position with respect to the experi- 
mental variables that it can be inter- 
preted as representing a clearly 
meaningful and reasonable underly- 
ing variable. To achieve this goal, 
most investigators have found it 
necessary to rotate the axes repre- 
senting the extracted factors to new 
positions; positions as free as possible 
from the incidental influences of the 
unique combination of experimental 
variables and of the method of extrac- 
tion. 
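
The indeterminacy described above can be verified directly for the orthogonal two-factor case. In the sketch below, the loadings are hypothetical, and the check is that the sums of cross-products between test vectors (the correlations reproduced by the common factors) are unchanged by any rigid rotation of the reference axes.

```python
import math


def rotate(loadings, angle_deg):
    """Rotate a two-factor loading matrix through angle_deg degrees
    (orthogonal rotation of the reference axes)."""
    c = math.cos(math.radians(angle_deg))
    s = math.sin(math.radians(angle_deg))
    return [[a * c + b * s, -a * s + b * c] for a, b in loadings]


def reproduced(loadings):
    """Cross-products of loading rows: the correlations (and, on the
    diagonal, communalities) implied by the factor solution."""
    return [[sum(x * y for x, y in zip(row_i, row_j)) for row_j in loadings]
            for row_i in loadings]


# Hypothetical loadings of four tests on two extracted factors.
extracted = [[0.70, 0.30], [0.60, 0.40], [0.20, 0.70], [0.50, 0.50]]

before = reproduced(extracted)
after = reproduced(rotate(extracted, 37.0))
largest_change = max(abs(x - y)
                     for row_b, row_a in zip(before, after)
                     for x, y in zip(row_b, row_a))
print(f"largest change in reproduced correlations: {largest_change:.2e}")
```

Since every angle fits the correlations equally well, the choice among positions must rest on some criterion other than goodness of fit.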


Two General Uses of Factor Analysis 


In some uses of factor analysis, the
objective, whether explicit or implicit,
is merely to achieve some degree
of parsimony by substituting a
smaller number of representative
variables for a multiplicity of experimental
variables. The goal may be
frankly that of operational convenience,
rather than to discover anything
of basic psychological significance.
Such an application might be
made to variates in the form of ratings
of different qualities of objects
or persons. Rotations of axes observing
objective criteria might lead to
more meaningful underlying dimensions,
but it would probably be an
exception rather than a rule to find
that such dimensions are of a basic,
psychological nature.

1 Based in part upon a paper read in a
Symposium on Selection of Variables for
Factor Analysis at the annual convention of
the American Psychological Association, in
Chicago, September 1960.

In the discussions to follow, em- 
phasis is placed on the second major 
use of factor analysis, whose goal is 
to discover basic psychological vari- 
ables. Such a use imposes upon the 
investigator considerably greater re- 
quirements in planning his study and 
more demanding standards for rota- 
tion of axes. It will be demonstrated 
that no known objective criteria for 
rotation are fully adequate, and the 
reasons for this situation will be 
pointed out. 


Simple-Structure Criterion 


The rotational criteria most com- 
monly used are Thurstone’s goals of 
positive manifold and simple structure. 
Positive manifold is useful in 
analyses in which the correlation co- 
efficients are all essentially positive 
before or after reflection of some test 
vectors. The criterion of simple 
structure has more general applica- 
tion and has been given greater atten- 
tion (Thurstone, 1947). 

Unfortunately, as will be demonstrated, 
the criterion of simple structure 
is not completely dependable. 
There is usually much latitude in 
satisfying the criterion and conse- 
quently much room for the operation 
of subjective judgment. Although 
this subjective element has often 
been an aid to gaining better psycho- 
logical understanding, at the same 
time it has been a source of irritation 
to scientific and mathematical con- 
sciences. 

As a general rule, the simple-struc- 
ture model may serve as the best ob- 
jective guide to the location of the 
basic descriptive dimensions of human 
nature. There is little in the 
way of proof, however, either logical 
or empirical, that this is true. Even 
if true as a principle, there is no as- 
surance that in a particular analysis 
the obtained simple structure applies 
other than to the arbitrarily selected 
set of experimental variables. There 
is no guarantee of invariance of loca- 
tions of psychologically-meaningful 
axes from analysis to analysis with 
some changes in sets of experimental 
variables. Furthermore, because of 
the looseness of specifications for 
simple structure, there are nearly always 
several possible simple-structure 
solutions to the same analysis. 
The question remains, "Which one of 
these best represents psychological 
reality?" 

Owing to the degree of latitude in 
specifying simple structure, the goal 
of simple structure can often be satisfied 
by making either oblique or 
orthogonal rotations. Theoretically, 
an oblique solution can go fur- 
ther toward satisfying simple-struc- 
ture specifications mathematically 
because of the extra freedom per- 
mitted in departures from 90-degree 
angles among the axes. But, as will 
be demonstrated, this may be a hol- 
low and accidental victory for oblique 
methods of rotation. In addition, it 
may provide misleading information 
concerning the exact nature of the 
factors and concerning their inter- 
correlations. 


Analytical Rotation Methods 


In recent years, a number of ap- 
proaches have been invented for ro- 
tating toward simple structure by 
analytical methods. These methods 
are based upon mathematical definitions 
of certain aspects of the simple-structure 
concept. Not all of Thurstone's 
specifications for simple structure 
can be expressed economically in 
mathematical form. Each method 
has its own peculiarities, depending 
upon which aspects of simple struc- 
ture are emphasized and upon the 
choice of orthogonal versus oblique 
type of solution. 

Many investigators who employ 
factor analysis as a research tool have 
welcomed one or more of the analyt- 
ical methods with open arms. These 
methods promise results “untouched 
by human judgment." The additional 
fact that they have been programed 
for high-speed computers has furnished 
the clinching argument in 
their favor. 

The main purpose of this article is 
to point out that complete depend- 
ence upon mechanical rotational so- 
lutions can lead the investigator 
astray. If mathematical criteria for 
a rotational solution are adopted, an 
analytical method will accomplish 
what it is designed to do. The critical 
question is, given the empirical data 
such as are usually gathered for a 
factor analysis, will the final results 
satisfy the needs of the psychological 
investigator? After all, the ultimate 
criterion of the psychological investi- 
gator is increased psychological un- 
derstanding, not simply the applica- 
tion of a neat computational system 
(Guilford, 1961). Ends and means in 
research should always be clearly 
discriminated. 


SAMPLING PROBLEM 


The crux of the rotation question is 
a sampling problem. Effects of sam- 
pling in factor analysis have been 
discussed by a number of leading au- 
thors, but the kind of sampling 
treated has referred to populations of 
individuals (when dealing with the 
R-model analysis). There is also a 
problem of sampling of experimental 
variables (which will be referred to as 
"tests" hereafter, for convenience). 
There is a universe of tests, of which 
any one analysis can include only a 
limited number, for obvious opera- 
tional reasons. A successful solution 
to a factor analysis is quite depend- 
ent upon the selection of the tests 
to be analyzed. 


Sampling of Tests 

It is well known that factors will 
not emerge from an analysis unless 
tests of those factors are included for 
analysis. Every factor must be rep- 
resented by a minimum of two tests. 
For overdetermination of the rota- 
tional solution, of course, there must 
be more than two. But the matter is 
not as simple as that. 

The distinguishing of one factor 
from another and the locating of 
factor axes depend upon an adequate 
and appropriate sampling. It is not 
simply a matter of how many tests 
there are for a factor. It is more important 
where the test vectors lie in 
the common-factor space. The 
judicious choice of tests may require 
much knowledge gained from previous 
analyses. Without such information, 
good a priori hypotheses concerning 
the nature of the expected 
factors are required. The worth of such 
hypotheses depends upon the psycho- 
logical insight of the investigator. 
He should be able to make predic- 
tions as to which tests are likely to 
have zero, or near zero, intercorrela- 
tions in the population of individuals 
and which ones will have substantial 
intercorrelations. Even with con- 
siderable prior knowledge and with 
the best intentions as to test selec- 
tion, it sometimes turns out that the 
factors are not optimally represented 
by tests, as the following example will 
show. 


Example of an Unfavorable Sampling 


Figure 1 shows a plot after orthog- 
onal rotations of axes in a problem 
with 70 tests and 18 factors (Guilford 
& Zimmerman, 1956). The tests were 
short inventories. On the basis of 
prior factor-analytic findings, five of 
the inventories were hypothesized to 
be measures of the factor of Ascend- 
ance or Self-assertion (A) and seven 
were likewise hypothesized to meas- 
ure the factor of Sociability or Gre- 
gariousness (S). The 12 variables 
are described in terms of trait names 
and sample items as follows.² The 
numberings are those assigned in the 
study to which we have referred. 

6. Being conspicuous: Do you dislike to 
have people watch you while you are 
working? 

7. Maintaining one's rights: Do you ever 
protest to a waiter or clerk when you 
think you have been overcharged? 

8. Self-defense: Are you rather good at 
bluffing when you find yourself in a 
difficulty? 

9. Social initiative: Do you ever take the 
initiative to enliven a dull party? 

10. Fear of social contacts: Have you ever 
been hesitant to make application for a 
job in person? 

27. Liking for friends and acquaintances: 
Do you generally enjoy getting ac- 
quainted with most people? 

28. Social leadership: In social conversa- 
tions, are you usually the listener rather 
than the talker? 

29. Social poise: Does it embarrass you a 
great deal to say or do the wrong thing 
in a social group? 

30. Liking the limelight: Do you generally 
feel uncomfortable when you are the 
center of attention on a social occasion? 

31. Shyness; bashfulness: Do you have to 
fight against bashfulness? 

32. Gregariousness: Do you like to mix 
socially with people? 

33. Liking for social affairs: Do you like to 
have many social engagements? 


2 Items are quoted by permission of the 
Sheridan Supply Company. 


Fig. 1. [Plot omitted.] Loadings of .30 and above on Factor Axes S 
(Gregariousness or Sociability) and A (Ascendance). Symbols distinguish 
items designed for Factor S, items designed for Factor I (Confidence), 
and items designed for some ... [remainder of legend not recoverable]. 


It may be noticed that there is considerable 
logical similarity between 
some of the variables in the A group 
and some in the S group. This can be 
attributed to lack of clear logical 
discrimination between the two hy- 
pothetical constructs A and S before 
the factor analysis. Item analyses 
had previously failed to separate the 
two groups of items more clearly 
because in item analyses the two 
provisional criterion scores were re- 
lated to both factors. That is, the 
score for A was contaminated with 
variance in Factor S and that for S 
was contaminated with variance in 
Factor A. As a consequence, some 
items were scored for both A and S in 
the earlier inventories from which 
these items came, thus contributing 
to a strong correlation between the 
two scores. Even when items are not 
scored for both A and S, as in the 
Guilford-Zimmerman Temperament 
Survey, there is substantial correla- 
tion because items scored for the one 
factor are weighted to some extent 
for the other factor. This correlation 
is sometimes taken to mean correla- 
tion between factors, an interpreta- 
tion that is probably erroneous, as 
subsequent discussion will show. 
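
The arithmetic behind this caution can be sketched briefly. The following minimal example in Python with NumPy assumes two uncorrelated factors and hypothetical loadings (the numbers and variable names are illustrative only). Under the common-factor model with orthogonal factors, the correlation contributed by the common factors is the sum of products of the loadings, so cross-weighted scores correlate even though the factors do not.

```python
import numpy as np

# Hypothetical loadings on uncorrelated Factors A and S (illustrative values).
score_a = np.array([0.6, 0.3])   # scored for A, weighted to some extent for S
score_s = np.array([0.3, 0.6])   # scored for S, weighted to some extent for A

# With orthogonal factors, the correlation due to common factors is the
# sum of products of corresponding loadings.
r_contaminated = float(score_a @ score_s)

# Univocal scores for the same two factors, for comparison.
r_univocal = float(np.array([0.6, 0.0]) @ np.array([0.0, 0.6]))

print(round(r_contaminated, 2), round(r_univocal, 2))
```

In this sketch the contaminated scores correlate .36 while the univocal scores correlate zero, with the factors independent in both cases.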

It should not be surprising, then, 
to note in Figure 1 that most of the 
tests have substantial loadings on 
both factors. Only four of them, two 
for each factor, are essentially uni- 
vocal. In the rotation of axes in fac- 
tor analysis, the consequences are 
very significant. Because of the 
heavy concentration of tests midway 
between the two reference axes, an 
analytical solution that aims at 
maximizing vectors in hyperplanes 
would be strongly impelled to place 
an axis through that concentration. 
Thus Factors A and S would collapse 
into one. Not shown in Figure 1, the 
remainder of the 70 test vectors tend 
to cluster around the origin and be- 
tween the origin and the cluster of 


plotted points. This circumstance 
would also contribute to placing an 
axis midway of A and S. It is prob- 
able that analytical rotations in gen- 
eral tend to lose factors in such a 
fashion. 

Consider a few hypothetical cases 
of test sampling. If the factor analy- 
sis under consideration had included 
only Tests 7, 8, 32, and 33, among the 
12 heavily loaded ones, an analytical 
solution might well approximate that 
shown, which was accomplished by 
Zimmerman's graphic method (Zim- 
merman, 1946), the axes being lo- 
cated by inspection. On the other 
hand, if the sampling had excluded 
these four tests, there would also be 
agreement, but of a different kind, 
between graphic and analytical rota- 
tions. A single axis would run through 
the central cluster. 

Now such a factor would be inter- 
pretable psychologically, partaking of 
a combination of the properties of S 
and A. It would lack, however, the 
logical clarity and distinctness of the 
meaning of either Factor S or Factor A. 
Turning attention to Tests 7 and 8, 
it can be said that A pertains to standing 
up for one's rights or asserting 
one's self. Considering Tests 32 and 
33 only, it can be said that S pertains 
to interest in being with people or 
sheer gregariousness. There is no 
implication of liking people or liking 
to be with them in A and no implica- 
tion of being socially bold in S. 

The zero or near-zero correlations 
between 7 and 8 on the one hand and 
32 and 33 on the other, indicate the 
probable independence of the two 
traits in the population of individuals. 
The observed correlations of Test 7 
(Maintaining one's rights) with 32 
(Gregariousness) and 33 (Liking for 
social affairs) were .06 and —.19, 
respectively. The correlations of 
Test 8 (Defending one's self) with 
the same two tests were —.07 and .11, 
respectively. Like all facts, correla- 
tion coefficients are stubborn things. 
In a correlation matrix that, after 
systematic reflection of test vectors 
(if necessary), indicates a positive 
manifold, a 90-degree separation in 
rotating reference axes for tests cor- 
relating zero is most reasonable. 
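
The geometric reasoning can be written out. The symbols below are introduced here for illustration: $h_j$ is the length of the vector for Test $j$ in the common-factor space (the square root of its communality) and $\theta_{jk}$ is the angle between the vectors for Tests $j$ and $k$. Insofar as the correlation is reproduced by the common factors, it is the scalar product of the two test vectors:

```latex
r_{jk} = h_j h_k \cos\theta_{jk},
\qquad\text{so that}\qquad
r_{jk} = 0,\; h_j h_k > 0 \;\Longrightarrow\; \theta_{jk} = 90^\circ .
```

On this reading, the near-zero correlations of Tests 7 and 8 with Tests 32 and 33 place the two pairs of vectors nearly at right angles, and axes passed through them are nearly orthogonal.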


SOME COMPARISONS OF GRAPHIC 
AND ANALYTICAL ROTATIONS 


The centroid-factor matrix, with 
70 tests and 18 factors, from the 
study used as an example, has been 
rotated by the normal varimax 
method (Kaiser, 1958) and also by 
the quartimax method (Neuhaus & 
Wrigley, 1954), using programs avail- 
able for the IBM 7090 and IBM 709 
computers, respectively.³ Only those 
aspects that illustrate features of 
analytical-rotation results as com- 
pared with graphic-rotation results, 
with special reference to the situation 
represented in Figure 1, will be men- 
tioned here. 
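
For readers who wish to see what such a program does, the following is a minimal sketch in Python with NumPy of analytical rotation by successive planar rotations of pairs of axes. It is not the IBM program used in this study. The function name `orthomax` and its parameters are assumptions of this sketch: `gamma=1` gives the varimax criterion, `gamma=0` gives the quartimax criterion, and `normalize=True` applies the row normalization that distinguishes the normal varimax method. The small loading matrix is hypothetical.

```python
import numpy as np

def orthomax(loadings, gamma=1.0, normalize=True, tol=1e-10, max_sweeps=100):
    """Orthogonal rotation by pairwise planar rotations (illustrative sketch)."""
    L = np.array(loadings, dtype=float)
    p, k = L.shape
    if normalize:
        h = np.sqrt((L ** 2).sum(axis=1))   # vector lengths of the tests
        h[h == 0] = 1.0
        L = L / h[:, None]
    for _ in range(max_sweeps):
        largest = 0.0
        for i in range(k - 1):
            for j in range(i + 1, k):
                x, y = L[:, i].copy(), L[:, j].copy()
                u, v = x ** 2 - y ** 2, 2 * x * y
                A, B = u.sum(), v.sum()
                C, D = (u ** 2 - v ** 2).sum(), 2 * (u * v).sum()
                num = D - gamma * 2 * A * B / p
                den = C - gamma * (A ** 2 - B ** 2) / p
                phi = 0.25 * np.arctan2(num, den)   # angle for this plane
                largest = max(largest, abs(phi))
                c, s = np.cos(phi), np.sin(phi)
                L[:, i] = c * x + s * y
                L[:, j] = -s * x + c * y
        if largest < tol:                    # no plane needed further turning
            break
    if normalize:
        L = L * h[:, None]
    return L

# Hypothetical univocal structure, turned 30 degrees away from its axes.
target = np.array([[0.8, 0.0], [0.7, 0.0], [0.6, 0.0],
                   [0.0, 0.8], [0.0, 0.7], [0.0, 0.6]])
t = np.deg2rad(30.0)
turn = np.array([[np.cos(t), np.sin(t)], [-np.sin(t), np.cos(t)]])
start = target @ turn

varimax = orthomax(start, gamma=1.0)
quartimax = orthomax(start, gamma=0.0, normalize=False)
```

With univocal tests of this kind both criteria return the axes to the test clusters. The cases of interest in this article are those in which most tests are not univocal.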


Approach to Simple Structure 


In terms of Thurstone's specifications 
for simple structure, the graphic 
solution provides a good, orthogonal 
simple structure. One specification is 
that there should be as many zero 
factor loadings in each column of the 
factor matrix (per factor) as there are 
factors. Counting any loading within 
the range —.1 to +.1 as essentially 
zero, the graphic solution yields an 
average of 33 zeros per factor, with a 
range of 19 to 42, where there are 18 
factors. 
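
The counting rule just stated can be sketched as follows, in Python with NumPy. The function name `zero_counts` and the small matrix are hypothetical and serve only to show the tally; they are not data from the study.

```python
import numpy as np

def zero_counts(loadings, bound=0.1):
    """Number of loadings in each column (factor) within -bound to +bound."""
    loadings = np.asarray(loadings, dtype=float)
    return (np.abs(loadings) <= bound).sum(axis=0)

# Hypothetical matrix of five tests by two factors (illustrative values).
F = np.array([[ 0.62,  0.05],
              [ 0.55, -0.08],
              [ 0.10,  0.48],
              [-0.03,  0.71],
              [ 0.34,  0.29]])

counts = zero_counts(F)
print(counts, counts.mean())
```

Applied to a full rotated factor matrix, the mean and the range of `counts` correspond to the averages and ranges reported for the three solutions.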

3 For assistance in computation, we are in- 
debted to the System Development Corpora- 
tion for the quartimax solution and to West- 
ern Data Processing Center for the normal 
varimax solution. 

The two analytical solutions show 
that it is possible mathematically to 
improve matters in this respect, as 
should be expected. Roughly speak- 
ing, the methods are designed to 
maximize the number of zero loadings 
while building up higher loadings in a 
limited number of places in the 
matrix. The varimax solution, also 
orthogonal, yielded an average of 39 
zero loadings per factor, with a range 
from 25 to 47. The quartimax solu- 
tion does a trifle better, with an aver- 
age of 40 and a range from 21 to 50. 

Another specification for simple 
structure is that each test shall have 
nonzero loadings in as few factors as 
possible. The Thurstone criterion is 
quite liberal in this respect, calling for 
at least one zero. All three methods 
went far beyond this requirement. A 
somewhat related requirement implied 
is that there shall be very few tests 
each having more than one substan- 
tial loading in common factors. In 
other words, the complexity of each 
test should be minimized. The ana- 
lytical methods, especially, aim to- 
ward this goal. The three solutions 
are compared on this basis. 

The usual custom is followed of 
considering loadings of .3 or greater 
as significant in the sense of indicat- 
ing enough variance in a factor for 
purposes of interpretation. In no 
solution was the complexity of any 
test greater than four; that is, no test 
had more than four loadings of .3 or 
greater. Most frequently the tests 
were of complexity one or two. The 
mean complexity for the graphic solu- 
tion was 1.9, for the varimax solution 
1.9, and for the quartimax solution 
1.6. Out of 70 test variables, the 
numbers with complexity greater 
than two were 16 for the graphic 
solution, 10 for the varimax, and 9 for 
the quartimax. Curiously, the vari- 
max solution had one test with a 
complexity (as defined above) of zero 
and the quartimax had two such 
tests. The graphic solution had none. 
The communalities of the three tests 
in question were not low; they were 
.79, .70, and .45. Their common- 
factor variances were simply widely 
scattered by the analytical rotations. 
From the graphic solution the com- 
plexities of these three tests were 
three, two, and one. Thus it appears 
that an analytical solution is mathe- 
matically better for some tests and 
worse for others, in the way of yield- 
ing appropriate degrees of factorial 
complexity. 
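
How a test can have a complexity of zero without a low communality is easily shown. The following is a minimal sketch in Python with NumPy; the function names and the two rows of loadings are hypothetical, and the use of absolute values in applying the .3 cutoff is an assumption of the sketch.

```python
import numpy as np

def complexity(loadings, cutoff=0.3):
    """Number of factors on which each test (row) loads at or above cutoff."""
    loadings = np.asarray(loadings, dtype=float)
    return (np.abs(loadings) >= cutoff).sum(axis=1)

def communality(loadings):
    """Sum of squared common-factor loadings for each test (row)."""
    loadings = np.asarray(loadings, dtype=float)
    return (loadings ** 2).sum(axis=1)

# Two hypothetical tests with nearly equal communality (illustrative values).
F = np.array([[0.29, 0.29, 0.29, 0.29, 0.29],   # variance widely scattered
              [0.60, 0.25, 0.00, 0.00, 0.00]])  # variance concentrated

c = complexity(F)
h2 = communality(F)
print(c, np.round(h2, 2))
```

The first row has no loading as large as .3 although its communality is about .42, which is the situation described for the three tests in question.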

Although it is apparent that the 
two analytical methods have ap- 
proached more nearly the require- 
ments of simple structure, it might be 
said that they have overshot the 
mark; they have carried things too 
far. If the aim is to satisfy some 
purely mathematical goals, the choice 
of rotational solutions is clear. Let us 
consider at what cost to psychological 
meaning these two particular solu- 
tions have achieved their technical 
superiorities. 


Fitting the Models to Psychological 
Meanings 


The crucial question that the psy- 
chologist should ask is, "How well do 
the three solutions represent psycho- 
logical realities?" A more rigorous 
question is, "Which results will be 
most reproducible in spite of changes 
in sampling of tests of the same fac- 
tors?" In other words, "Which solu- 
tion promises greater invariance from 
analysis to analysis by different in- 
vestigators analyzing different test 
variables to represent the same fac- 
tors?" 

4 The authors know of at least one instance 
in another analysis in which a graphic solution 
yielded more zero loadings than did a varimax 
solution. Thus, this particular advantage is 
not always on the side of analytical methods. 

It is not begging the question to 
accept the factors of the graphic solution 
as representing the best psychological 
picture of the dimensions of 
personality involved. It is not neces- 
sary to rest the case in favor of the 
graphic solution upon the fact that it 
verified almost completely the hy- 
pothesized factors that guided the 
selection of test variables. Such fac- 
tors have been found independently 
by other investigators, the evidence 
for which has been summarized by 
the first author (Guilford, 1959). 
More recently, Comrey and Soufi 
(1961) have reported considerable 
supporting evidence in an analysis in 
which varimax rotations were used. 
Their study was conducted under 
more favorable test-sampling condi- 
tions than usual when analytical 
rotations are employed.⁵ 

From inspection of results from the 
varimax and quartimax methods of 
rotation generally, the authors have 
gained the impression that these 
methods tend to throw the larger 
factor loadings toward a very few 
“stronger” factors and away from a 
larger number of "weaker" factors. 
By stronger and weaker factors is 
meant that a larger versus a smaller 
number of tests have potentially 
significant loadings on those factors. 
This feature is well illustrated by the 
particular factor analysis under dis- 
cussion. The graphic solution dis- 
tributes the numbers of significant 
loadings more evenly among the 
different factors. Of the 18 rotated 
factors in the graphic solution, only 1 
had as many as 15 significant loadings 
(.3 or greater). Half of the factors 
had from 5 to 9 significant loadings, 
much in line with the numbers of test 
variables selected to represent the 
hypothesized factors. The quartimax 
solution yielded 2 factors with 20 
and 29 significant loadings and 11 
factors with fewer than 5 such loadings. 
The varimax solution produced a more 
even distribution of significant load- 
ings, but still showed 2 factors with 
more than 15 each and 7 factors with 
less than 5. 

One general effect upon psycho- 
logical interpretation is that the 
rotated factor with an unusually 
large number of significant loadings 
may seem to embrace a variety of 
qualities that are difficult to see as 
representing a psychological unity. 
At best, such factors can be recog- 
nized as confoundings of factors that 
would have been separated in graphic 
solutions. Zimmerman (1953) has 
called them “composite” factors. 

From the configuration of tests 
shown in Figure 1, it was predicted 
that a confounding of Factors A and 
S should be expected in analytical 
rotations. Both the varimax and 
quartimax solutions showed the pre- 
dicted confounding, but the factor 
was further confounded by the addi- 
tion of Factor I (Self-confidence 
versus Inferiority feelings). This 
composite factor will be referred to 
hereafter as Factor AIS. A second 
fairly obvious confounding was ob- 
tained in the two analytical solutions, 
more so in the quartimax solution. In 
the varimax solution this confounding 
included Factors N (Composure versus 
Nervousness), Ag (Agreeableness 
or Friendliness versus General hostility), 
and Co (Cooperativeness versus 
Hypercriticalness). The quartimax 
solution added to this trio Factors D 
(Cheerfulness versus Depression) and 
O (Objectivity versus Hypersensi- 
tivity). 

Of the 13 hypothesized factors, only 
two were brought out clearly by 
either analytical method: M (Mascu- 
linity versus Femininity) and R 
(Restraint versus Rhathymia). It 
might be concluded that the sampling 
of tests was sufficiently favorable for 
these two factors to appear by any 
rotational method. In the varimax 
solution, two additional factors are 
recognizable: N (Composure versus 
Nervousness) and T (Reflectiveness). 
Some of the N tests went into a con- 
founding of factors in both solutions, 
however. In the quartimax solution 
there were curiously two different 
factors loaded significantly on items 
for T, either of which could logically 
be taken to represent that factor. 


Comparison of Loadings from the 
Three Solutions 


Let us take a closer look at the 
analytical Composite Factor AIS in 
comparison with the Graphic Factors 
A, I, and S. In this discussion it will 
be helpful to have a clearer concep- 
tion of the five tests representing 
Factor I: 


17. Personal strengths and weaknesses: 
Are you physically stronger than most 
of your friends of the same sex? 

18. Feeling of adequacy: Do you always 
know what to do next? 

19. Self-confidence vs. inferiority feelings: 
Do you always feel that you can ac- 
complish the things you want to do? 

20. Discontent with self and status: Do you 
ever wish that you were taller or shorter 
than you are? 

21. Feeling of acceptance by others: Are 
you frequently afraid that other people 
will not like you? 


Imagine a three-dimensional plot 
involving Orthogonal Factors A, I, 
and S. This would show within one 
octant a relatively high density of 
test vectors in a region near Tests 8, 
9, 10, 18, 19, 30, and 31. This is ap- 
proximately where the axis AIS was 
directed in both analytical solutions. 

5 In the Aptitudes Research Project varimax 
rotations have been made on several occasions 
without finding satisfactory psychological 
interpretations. In every case graphic rota- 
tions have proved to be more satisfactory. 


TABLE 1 

COMPARISONS OF LOADINGS ON THE VARIMAX AND QUARTIMAX CONFOUNDED FACTORS 
AIS WITH LOADINGS BY THE ZIMMERMAN GRAPHIC ROTATIONS 
ON SEPARATE FACTORS A, I, AND S* 

        Varimax   Quartimax          Zimmerman 
Test      AIS        AIS         A       I       S 

  6       .50        .59        .49     [E]     .26 
  7       .37        .39       [-14]    .09    -.05 
  8       [at]       .43        .46     .22     .02 
  9       .74        .76        .48     .16     .48 
 10       .44        .56        .40    [A8]     .29 

 17       .28        .31        .13    [ES]     .05 
 18       .38        .54        .31     .34     .15 
 19       .38        .50        .23     .45     .11 
 20       .36        .43        .22     .43     .12 
 21       .43        .41        .27     .64     .12 

 27       .60        .45        .29     [T]     .45 
 28       .80        .79        .51     .05     .59 
 29       .66        .72        .29     .46     .43 
 30      [-15]       .75        .49    -.07     .53 
 31       .70        .73        .44     .27     .47 
 32       .48        .36        .14     .19     .40 
 33       .63        .56       -.04    [A5]     .77 

* The three blocks of tests have been designed for Factors A, I, and S, respectively. 
Bracketed entries are illegible in the source and are reproduced as they appear. 


Some investigators might prefer 
this aspect of the rotational solution 


as representing psychological reality 
and they would perhaps name Factor 
AIS "social adjustment." There are 
at least two important things wrong 
with such a conclusion, despite the 
parsimony gained in substituting one 
concept for three. Offsetting the gain 
in parsimony is the loss of information 
through failure to make significant 
discriminations that the data readily 
permit. This point is best illustrated 
by means of Table 1, in which the 
loadings in the two AIS factors are 
listed alongside the loadings in 
Factors A, I, and S found by the 
Zimmerman graphic method for the 
17 pertinent tests. In a number of 
instances it can be seen that two or 
more tests that have essentially the 
same loading on Factor AIS have 
quite different patterns of loadings 
on the three Zimmerman factors. 

The principle just illustrated parallels 
another pertaining to measurement 
of traits in individuals. Two 
persons with identical scores for a 
composite trait such as AIS can have 
different combinations of scores for 
components such as A, I, and S. 
Whether the single composite scores 
are preferable depends upon profes- 
sional goals. There is no question 
about where the greater amount of 
information lies, for information 
means discriminations. 
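
The point made with Table 1 can be checked on two of its rows. The following is a minimal sketch in Python, assuming that entries printed in the table without a decimal point are loadings from which the point has been dropped (so that 43 is read as .43). The helper name `dominant_factor` is hypothetical.

```python
# Loadings for Tests 8 and 20 as read from Table 1: quartimax AIS, then the
# Zimmerman graphic loadings on A, I, and S (decimal points restored by
# assumption).
table1 = {
    8:  {"AIS": 0.43, "A": 0.46, "I": 0.22, "S": 0.02},
    20: {"AIS": 0.43, "A": 0.22, "I": 0.43, "S": 0.12},
}

def dominant_factor(row):
    """Separate factor (A, I, or S) with the largest absolute loading."""
    return max(("A", "I", "S"), key=lambda f: abs(row[f]))

same_composite = table1[8]["AIS"] == table1[20]["AIS"]
leaders = {test: dominant_factor(row) for test, row in table1.items()}
print(same_composite, leaders)
```

The two tests are indistinguishable on the composite factor, yet one is chiefly a measure of A and the other chiefly a measure of I.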

The second point to consider is 
that the concept of social adjustment 
may not be entirely accounted for in 
terms of just three factors A, I, and S. 
Such a broad concept should prob- 
ably include other qualities, such as 
those described in connection with 
Factors O, Ag, and Co, which were 
named earlier. Each investigator or 
professional psychologist may have 
his own ideas as to what qualities 
should enter into the broad concept 
of social adjustment. Satisfactory 
communication requires general 
agreement concerning the constitu- 
ents of this concept. Like all terms 
in the description of personality, this 
one should be established by empiri- 
cal procedures. This goal is likely to 
be achieved with greatest invariance 
through a factor analysis that em- 
ploys an appropriate sampling of 
tests and an appropriate rotational 
method. 


ORTHOGONAL VERSUS OBLIQUE 
ROTATIONS 


For some investigators, the em- 
pirical approach to the determination 
of natural syndromes of personal 
qualities is through oblique rotation 
of axes and the finding of higher- 
order factors by factoring the first- 
order factors. There are many issues 
involved in this approach, the most 
important of which is the test-sam- 
pling problem. This issue has gen- 
erally been ignored. Let us consider 
it as applied to Figure 1. 

It is actually quite difficult to see 
where a graphic, oblique rotation 
would place the axes in Figure 1. If 
one were to give most weight to 
bounding hyperplanes (the near posi- 
tive manifold strongly suggests the 
reasonableness of doing so) an oblique 
rotation in the A-S plane might be 
very close to the orthogonal one 
shown. But suppose that Tests 7 and 
8 had not been in the battery. A 
hyperplane would then undoubtedly 
pass near Tests 18, 53, 10, and 
6. The new angle between the axis 
replacing A and Axis S would be 
definitely less than 90 degrees. Now 
assume that Tests 32 and 33 were 
absent from the battery. A hyperplane 
might then be placed through 
Tests 29 and 27 and near 28. These 
two hypothetical sampling cases 
merely illustrate that the accidental 
presence or absence of one or two 
tests may make a very marked difference 
in the nature of the oblique 
structure. 

Instead of concluding that because 
most of the tests designed for Factor 
A show some relation to Factor S and 
most of the tests for S have some 
relation to Factor A the two factors 
are correlated in the population, it 
appears more reasonable to conclude 
that there has been a failure in most 
cases either to construct or to select 
univocal tests for A and univocal 
tests for S. The indication in favor 
of this conclusion is that there are 
zero correlations between Tests 32 
and 33 on the one hand and Tests 7 
and 8 on the other. These facts 
strongly suggest the independence of 
the two factors. Even where there 
are no zero correlations between tests 
of two factors, it is not safe to con- 
clude immediately that the factors 
are correlated. The two qualities 
may be truly independent but there 
has been failure in test making or test 
selection. 

There seems to be a common belief, 
sometimes amounting to an obses- 
sion, among factor analysts and oth- 
ers, that there are no genuinely zero 
correlations among traits and among 
measures of them. Spearman started 
the idea, which, among his British 
followers is taken as "proof" that 
there is always a general factor, and 
which, among Thurstone followers is 
taken as proof that there is always 
oblique factor structure. Actually, 
zero correlations are not nearly so 
rare as commonly supposed. Hun- 
dreds of them can be exhibited. 
There would be many more if the 
tests were constructed with sufficient 
experimental controls to keep their 
factorial complexity at a minimum. 

In many areas of personality and 
in many populations, psychological 
factors are very likely interrelated.⁶ 
But the degree of this correlation 
between two factors is probably 
never exactly indicated by the obtained 
correlation between two measures 
representing those factors. It is 
difficult to know whether tests of two 
factors are as nearly univocal as they 
should be or whether the nonzero 
correlations between tests representing 
two factors arise from cross contaminations. 
These unanswered questions 
also emphasize the dangers of 
accepting results from oblique rota- 
tions, whether graphic or analytical, 
as correct information concerning 
correlations of factors. 


SOME SUGGESTIONS CONCERNING 
TEST SAMPLING 


Some of the remedies for the test- 
sampling problem are rather obvious. 
In general, it is necessary to sharpen 
conceptions of factors that are already 
known as well as those that are hy- 
pothesized but not yet discovered. 
A general theory of the domain within 
which the analysis is being conducted, 
for example the theory of the 
structure of intellect, is very helpful 
(Guilford & Merrifield, 1960). In the 
work of the Aptitudes Project at the 
University of Southern California, it 
is now necessary only to keep in mind 
the parameters of the model and the 
distinct categories belonging to each 
parameter. By controlling a test as to 
content, operation, and product, the 
necessary differentiations and specifi- 
cations can be made, leading to fairly 
definitive tests for each new factor 
under investigation. The efficacy of 
this practice has already been demon- 
strated by a number of analyses. 
When by this approach a univocal 
test for a particular factor is not 
achieved, it can be assumed that at 
some point the necessary experi- 
mental controls were not applied in 
the construction of the test, in its 
administration, or in scoring. 

6 Elsewhere (Guilford, 1959; Guilford & 
Merrifield, 1960) the first author has accepted 
the logical probability of a general hierarchical 
model of personality, which, if it has common 
application, implies interrelated factors of 
different levels of generality. 

When a general theory is lacking, 
it is still important to arrive logically 
at distinguishable psychological con- 
structs. Being prepared to “go out- 
side the field" looking for a variety 
of new ideas is helpful. Narrow 
horizons restrict the view and limit 
the variety of potential factors and 
tests that can be conceived. In gen- 
eral, investigators need to relax some- 
what their drive to achieve parsi- 
mony. Long ago, psychology should 
have progressed beyond the stage in 
which investigators continue to look 
for the “philosopher's stone.”

The discovery of new factors of 
basic psychological significance re- 
quires new hypotheses, and usually 
entirely new tests. As factor analyses 
are often carried out, there is still too 
much dependence upon the use of test 
variables that just happen to be 
available.” The total scores of some 
standard instruments have been ana- 
lyzed “to death" without adding any- 
thing of importance to psychological 
knowledge. 


Use of Marker Tests 


In order to segregate possible 
secondary sources of variance in new 
tests, the use of marker tests for al- 
ready known reference factors is 
always desirable. After all reasonable 
effort has been expended to make 
new tests as nearly univocal as pos-
sible, there may still be some ap-


1 This point has been previously elaborated
upon by the first author (Guilford, 1952).
preciable secondary common-factor
variance. It is important to account 
for such secondary variance espe- 
cially if the same secondary common 
factor is likely to be involved in other 
tests in the battery. Another use of 
marker tests is to determine whether 
some new factor is, after all, identifi- 
able with one of those previously 
known. 

The sampling of marker tests pre- 
sents its own unique problems, in 
addition to the general problems of 
test sampling already mentioned. 
Accounting for only a few reference 
factors entails enlarging the test 
battery and hence testing time, where 
testing time is usually at a premium. 
The situation is further aggravated 
by the fact that univocal tests are not 
always available for the reference 
factors. The result is that in any one 
study the number of new factors that 
can be investigated is usually quite 
limited. 

It is often stated that there should 
be at least three tests of every ex- 
pected factor in an analysis, so as to 
overdetermine the location of the 
hyperplane for that factor. This is 
definitely good advice to follow in 
the case of new hypothesized factors 
and known factors that need further 
verification and clearer definition. 
In the Aptitudes Project, test-sam- 
pling procedures have been employed 
that reduce materially the number of 
marker tests needed in a battery.
There are tests that are essentially 
univocal for some of the known fac- 
tors. If a test is known to function 
univocally in the experimental popu- 
lation and if there are one or two tests 
likely to yield significant loadings in 
the factor, one marker test will
suffice. In graphic rotations, one can
direct an axis through that univocal 
test's vector. 

If the univocal marker test is not
particularly strong in the factor, its
vector length being relatively short, it
has been found useful to introduce
two separately timed and separately
scored parts as markers. The worst
that can happen, apparently, is that
the specific variance generates an 
additional doublet factor, which is 
readily identifiable as such. If no 
doublet specific appears, the com- 
mon-factor and specific variances are 
probably confounded, resulting in 
somewhat higher loadings than usual 
in the marker test. Since we are not 
seeking information concerning the 
size of loading of the marker test, 
such spurious loadings can be tol- 
erated. 

The suggestions that have just 
been made concerning the use of 
marker tests add to the difficulties in 
the use of either analytical rotations 
or "blind" graphic rotations. With 
or without marker tests, where psy- 
chological understanding is the major 
goal, it is usually necessary to make 
use of prior knowledge to guide the 
rotation of axes. To expect that
analytical rotations alone will solve
this problem is to indulge in wishful
thinking. It may be that procedures
combining the insights of the investi-
gator with the rigorous analytical
procedures will yield the best solu-
tions. The second author is working
along these lines at the present time.*


SUMMARY 


In applying factor analysis in 
basic research there is a serious prob- 
lem of sampling of tests, a problem 
that has not been given due considera- 
tion. The kind of test sampling in an 
analysis determines both the ap-
plicability of analytical rotation meth-


* Under contract NSF-G19489 with the
National Science Foundation.
ods and the nature of oblique struc-
tures where oblique rotations are
made. Both procedures yield out- 
comes that are functions of the kinds 
of tests that happen to be selected for
analysis. Graphic methods of rota- 
tion are less restrictive, for allow- 
ances can be made by the competent 
investigator for peculiarities of the 
sampling of tests, ensuring the ex- 
traction of maximal psychological 
information from the analysis. 

The optimal use of factor analysis 
to isolate known factors and to isolate 
new ones requires clear psychological 
insights into the nature of the hy- 
pothesized factors, new tests with 
built-in and external experimental 
controls, and appropriate marker 
tests for reference factors. The last- 
mentioned specification further re- 
stricts the use of blind graphical 
rotations as well as analytical rota- 
tions. It is suggested that man- 
machine procedures may be de- 
veloped to overcome these and other 
difficulties. 
AN ANALYSIS OF CUES TO AUDITORY DEPTH
PERCEPTION IN FREE SPACE


PAUL D. COLEMAN
School of Medicine, Johns Hopkins University 


Physical acoustics reveals a number of stimulus correlates of sound 
source distance. Quantitative estimates of these stimulus correlates 
are compared with appropriate psychophysical thresholds. Such com- 
parisons show that most of these stimulus correlates can, with various 
restrictions, provide distance information detectable by the ear(s). The 
stimulus correlates dealt with at greatest length are: intensity, frequency
spectrum at near and far distances, binaural intensity ratio, and inter-
aural phase (or time) differences. Problems relating to the use of some
of these stimulus correlates as cues are discussed. Other possible dis-
tance cues are briefly mentioned. The possible use of much of the avail-
able physical information in making distance judgments has not yet
been adequately evaluated in psychophysical studies.


The study of the perception of 
external space has, for a long time, 
been one of the major problem areas 
of sensory psychology. A relatively 
high degree of sophistication has been 
achieved with regard to problems 
such as the visual perception of space 
and the auditory perception of azi- 
muth. Our knowledge of auditory 
depth perception, however, has re- 
mained primitive. One of the most 
basic aspects of this area has been 
only scantily explored: the definition 
of the stimulus correlates of auditory 
depth perception. 

Physical acoustics suggests several 
possible stimulus correlates of the 
auditory perception of distance. They 
may be classified as either monaural 
or binaural correlates. Monaural 
cues are derived from the alteration 
of sound by its passage through the 
conducting medium. Binaural cues 
are derived from the geometry of the 
head and the behavior of sound waves
in the region of the head. 

Although physical acoustics can 
point out the existence of information 
that may provide the basis for several 
possible cues to auditory distance, it 
rests with psychophysics to show (a) 
that this information can be detected 
by the ear, and (b) that it is utilized 
in arriving at distance judgments. 
These two points will be raised in the 
case of each of the possible cues sug- 
gested by physical acoustics. 


MONAURAL CUES OF AUDITORY 
DISTANCE 


Amplitude 


It seems a truism to state that 
amplitude, or pressure, of the sound 
wave is a cue in auditory depth per- 
ception by virtue of the attenuation 
of sound with distance. This loss of 
amplitude with distance, generally 
referred to as the (1/R) loss or the 
inverse first power loss, is expressed 
as: 


$$\left(\frac{1}{R}\right)\ \text{loss in db.} = 20 \log_{10}\left(\frac{R}{R_o}\right) \qquad [1]$$


where R is the distance to Point P, 
and Ro is the distance to Reference 
Point Po. It can be seen that the loss
increases by 6 db. for each doubling of
distance. This loss is the most general
of the cues to distance, obtaining 
physically for all types of sounds and 
at all distances, though it may not 
always be detectable. Since the pres- 
ent paper is concerned with the free 
space situation, it is possible to dis- 
regard the major departures from the 
inverse first power loss caused by re- 
flecting surfaces and their effect on 
localization (Griffin, 1958; Lochner & 
de V. Keet, 1960). 
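
As an illustration only, the (1/R) loss expression above can be evaluated numerically. The following Python sketch is not part of the original analysis; the function name `inverse_first_power_loss_db` is an assumption introduced here, and the distances may be in any unit provided R and Ro share it.

```python
import math


def inverse_first_power_loss_db(r, r0):
    """Return the (1/R) loss in db. at distance r relative to reference distance r0."""
    if r <= 0 or r0 <= 0:
        raise ValueError("distances must be positive")
    return 20.0 * math.log10(r / r0)


# Doubling the distance (for example 10 ft to 20 ft) costs about 6 db.
doubling_loss = inverse_first_power_loss_db(20.0, 10.0)

# A tenfold increase in distance costs exactly 20 db.
tenfold_loss = inverse_first_power_loss_db(100.0, 10.0)

print(round(doubling_loss, 2), round(tenfold_loss, 2))
```

Because only the ratio R/Ro enters the expression, the same loss applies to a doubling from 1 foot to 2 feet as to a doubling from 100 feet to 200 feet.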

In order to determine the smallest 
magnitude of (1/R) loss that may be 
detected by the ear, one may refer to 
data such as those of Karlin (1945), 
Karlin and Stevens (1946), Miller 
(1947), or Reisz (1928). Karlin 
(1945) has found that judgments of 
“louder” or "softer" (with reference 
to a comparison displaced in time) 
are correct approximately 100% of 
the time when the overall sound pres- 
sure is changed approximately 1 db. 
(the frequency spectrum of his stimu- 
lus extended from 500-2,000 cps). 
Miller (1947) found the loudness 
difference limen (DL) for “white” 
noise to be asymptotic to .4 db. above
about 20 db. sensation level (SL). 
The data of Reisz (1928) obtained 
with pure tones indicate a higher DL 
than was found by Karlin or Miller 
except in the case of the most favor- 
able combinations of frequency and 
SL, where the DL may be .5 db. A
DL of 2 db. would seem, on the basis
of Reisz' data, to hold for a relatively
wide range of frequencies at 40 db. 
SL or greater. 

From the data of Karlin and Miller 
already referred to, it can be seen that
the (1/R) loss of .8 db. produced by a 
distance change such that (R/Ro) 
=1.1 should be detectable (depend- 
ing on SL) by the ear. According to 
the pure tone data of Reisz, this same 
distance change should produce a
detectable pressure change for fre-
quencies from approximately 64-8,192
cps at 60 db. SL or greater. At a
distance of 10 feet this difference in 
sound pressure would result from a 1- 
foot increment. 
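
The reasoning in this paragraph runs the (1/R) loss backwards: given a loudness DL in decibels, one solves for the distance ratio that would just produce it. A minimal sketch of that inversion follows, on the assumption that sound pressure is the only cue available; the function names are hypothetical and are introduced here for illustration.

```python
def distance_ratio_for_dl(dl_db):
    """Distance ratio R/Ro whose (1/R) loss equals a loudness DL of dl_db decibels."""
    return 10.0 ** (dl_db / 20.0)


def predicted_distance_increment(r0, dl_db):
    """Smallest distance increment at reference distance r0 that yields a loss of dl_db."""
    return r0 * (distance_ratio_for_dl(dl_db) - 1.0)


# A loss of .8 db. corresponds to a distance ratio of about 1.1.
ratio_for_08_db = distance_ratio_for_dl(0.8)

# With Miller's asymptotic DL of .4 db., the predicted increment at 10 ft is under half a foot.
increment_at_10_ft = predicted_distance_increment(10.0, 0.4)

print(round(ratio_for_08_db, 3), round(increment_at_10_ft, 3))
```

Such figures are predictions from loudness data alone and carry all the qualifications attached to the DL values on which they rest.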

No accurate recent data are avail- 
able concerning the ratio of a just 
noticeable distance (jnd) increment to 
a reference distance (R/Ro). Reinson 
(1952) and Edwards (1955) have 
determined (R/Ro) (both present the
same data) using a ticking watch as
the stimulus source. They obtain
values from (R/Ro) = 16.6 at 100 cm. to
(R/Ro) = 12.1 at 800 cm. (derived
from mean error in judging "toward"
or "away"). Several other investi-
gators have measured the DL for
distance or "accuracy of auditory 
depth perception" under various con- 
ditions (Elkin & Tagamlitzkaya, 
1939; Fletcher, 1957; Ikenberry & 
Shutt, 1898; Roberto, 1946). All 
these data, including the Reinson- 
Edwards data, are clearly inappropri- 
ate to the determination of the rela- 
tion between (R/Ro) and the loudness 
DL since the stimuli either were not 
in the median plane or were not pure 
tones—thus allowing the possible 
intrusion of cues other than sound 
pressure into the distance judgments. 

Gamble (1909) attempted to dem- 
onstrate, under crude conditions, 
that intensity was a major cue in 
estimating distance of sounds by 
showing that the DL for distance 
could be predicted from the DL for 
intensity (sic) on the basis of an in- 
verse square law. This use of the 
inverse square, or 1/R², loss is in error
since the ear responds to pressure
(amplitude), which falls off with 1/R.
Power falls off with 1/R². Although
expressing these quantities as decibels
should yield the same effect of dis-
tance on decibels for both, it is useful
to keep this difference in mind in
order to be certain the proper deriva-
tion is used (10 log P₁/P₂ or 20
log A₁/A₂) and to avoid error in
comparing results of several investi- 
gators which may be expressed in 
terms other than decibels. Unfortu- 
nately, the data available at that time 
concerning the loudness DL were ex- 
tremely crude, and sufficient informa- 
tion was not given by Gamble to 
allow an exact use of more recent data 
on the loudness DL. However, 
Gamble did conclude that the DL for 
distance is probably closely related to 
the DL for intensity. A more pre- 
cisely controlled experiment using 
"pure" tone stimuli (the purity of the 
stimuli may be questioned) in the 
median plane (Arps & Klemm, 1913) 
showed that the detection of a change 
in distance does not occur until the 
loudness DL has been greatly ex- 
ceeded. Thus it appears that the 
experimentally determined DL for 
distance may be larger than the DL 
for distance predicted on the basis of 
the DL for loudness. The authors 
both suggest and demonstrate that 
cues other than sound pressure (pre-
sumably "timbre") were present in
their experimental situation. This
suggests that the DL for distance 
based on sound pressure cues alone 
was larger than the corresponding DL 
for loudness by an amount even 
greater than their experimental deter- 
minations predict. This apparent 
discrepancy between the predicted 
and experimentally determined DL 
for distance requires resolution. This 
discrepancy might be reduced if the 
distance DL were determined in a 
dynamic (moving sound source) situa- 
tion (the determinations quoted above 
were all obtained in static situations) 
more closely approximating the “real 
life" localization process. In an 
angular localization task binaural 
intensity disparities as great as 200:1
have yielded no localization judg-
ments under certain conditions in a
static situation (Stewart, 1917). Ford 
(1942) speculated that considerably 
smaller binaural intensity disparities 
would yield angular localization judg- 
ments in a dynamic situation, and he 
presented data which support this 
speculation. 
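
The distinction drawn earlier in this section between pressure (1/R) and power (1/R²) can be checked numerically. The sketch below is illustrative only, and its function names are assumptions introduced here. It shows that the two raw ratios differ for a doubling of distance, while the decibel values agree once the appropriate multiplier (20 for pressure, 10 for power) is used.

```python
import math


def pressure_ratio_db(a1, a2):
    """Level difference in db. computed from a ratio of sound pressures (amplitudes)."""
    return 20.0 * math.log10(a1 / a2)


def power_ratio_db(p1, p2):
    """Level difference in db. computed from a ratio of sound powers."""
    return 10.0 * math.log10(p1 / p2)


r0, r = 10.0, 20.0                  # reference and observation distances, same unit
pressure_ratio = r0 / r             # pressure falls off with 1/R
power_ratio = (r0 / r) ** 2         # power falls off with 1/R squared

level_from_pressure = pressure_ratio_db(pressure_ratio, 1.0)
level_from_power = power_ratio_db(power_ratio, 1.0)

print(pressure_ratio, power_ratio, round(level_from_pressure, 2), round(level_from_power, 2))
```

Results that earlier investigators reported as raw ratios therefore cannot be compared until it is known whether pressure or power was meant.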

It is furthermore worthy of note 
that determinations of loudness DL 
are generally derived from experi- 
mental situations in which the transi- 
tion between stimuli is at a relatively 
rapid rate. Such conditions serve to 
minimize the DL. Determinations of 
the DL for distance, on the other 
hand, usually involve a stimulus 
change over a longer period of time— 
which would serve to increase the DL. 
The only experiment to which this 
point does not seem to apply is that of 
Arps and Klemm (1913). This con- 
sideration must also be kept in mind 
in the application of other types of 
psychophysical data to localization 
situations, as must considerations
such as stimulus duration, level, etc., 
which may influence psychophysical 
functions. For these reasons par- 
ticularly, estimates presented in this 
paper of detectable distance change 
based on the DL for other psycho- 
physical functions must be regarded 
as very rough approximations at best. 

Studies such as those of Gamble 
(1909), Békésy (1938, 1949), and 
Steinberg and Snow (1934) have shown
that the (1/R) loss is utilized to some 
extent in arriving at distance judg- 
ments. In these studies judgments of 
distance were modified by changing 
the sound pressure from a source at a 
fixed distance. Trimble (1934) found 
that in the case of dichotic stimula- 
tion, a change in intensity alters ap- 
parent distance when the binaural 
intensity ratio is 1:1. 
In spite of the major role that is
usually attributed to sound pressure, 
it is not the only cue in auditory 
depth perception. The experiments of 
Mohrmann (1939) demonstrating a 
degree of loudness constancy when 
only auditory cues to distance of the 
stimulus source were available, and 
studies (Arps & Klemm, 1913; 
Fletcher, 1957) indicating ability to 
make reasonably accurate relative 
distance judgments in the presence of 
conflicting loudness cues or in the 
absence of loudness cues led us to con- 
sider other possible cues to auditory 
depth perception. 


Frequency Spectrum 


Some of the earliest experimenters 
in the area of auditory localization 
(Angell & Fite, 1901; Gamble, 1909; 
Starch, 1911, 1912, 1915, 1916; 
Starch & Crawford, 1909) mentioned 
frequency spectrum as a cue to the 
distance of a source of sound. Stimu- 
lus frequency spectrum was not itself 
an experimental variable in these ex- 
periments; it was usually mentioned 
because the subjects often reported 
that at times their distance judgments
were influenced by the pitch or qual- 
ity of the stimulus. However, no con- 
sistent use of this cue was uncovered. 
“Sometimes the farther sound seemed 
higher in pitch and sometimes the 
nearer one" (Starch & Crawford,
1909). 

Further indirect support for the 
role of frequency spectrum in auditory 
depth perception came from Mohr- 
mann's (1939) studies demonstrating 
loudness constancy in the absence of 
nonauditory information as to the 
location of the loudspeakers that 
served as stimulus sources. Obvi- 
ously loudness constancy in the ab- 
sence of nonauditory cues as to the 
location of the sound sources, as ap-
pears to have been demonstrated in
this experiment, depends on informa-
tion about the distance of the source
of sound transmitted via the auditory 
channel. Stimulus loudness was not a 
source of this information because it 
was an experimental variable. Since 
the demonstrated constancy varied 
among the stimuli (speech, noise, 
music, pure³ tones, and metronome
clicks) as a function of the spectral
complexity of the stimulus, and the
presumed familiarity⁴ of the subjects
with the stimulus, the stimulus fre- 
quency spectrum may have been one 
of the distance cues. Klemm's (1913) 
study of the ability of a monaural 
person to locate sounds provides fur- 
ther indirect support for the role of 
frequency spectrum in distance per- 
ception, as do the experiments of 
Shutt (1898) showing distance locali- 
zation to be more accurate for a 
“complex” sound than for the sound 
from a pitch pipe, and those of Arps 
and Klemm (1913) demonstrating 
distance localization in the median 
plane when intensity is removed as a 
cue. 

Later investigations were more 
directly concerned with frequency 
spectrum as an experimental variable 
(Békésy, 1938; Hornbostel, 1923) in 
the study of auditory distance per- 
ception. Although these studies are in 
general agreement that the frequency 
spectrum of a sound stimulus can in- 
fluence the apparent distance of the 
sound source, there is conflict as to 
the basis and direction of this influ- 
ence. 

Békésy (1938) has obtained data 
indicating that, at distances less than 


3 On the basis of statements made by the 
author, it seems that the tones had a rela- 
tively rich harmonic content. 

4 It has been experimentally demonstrated
(Coleman, 1959) that ability to localize the 
distance of sources of sound is related to the 
number of exposures to the stimulus. 
4 feet, the source appears to approach
the observer as the lower frequencies 
in a sound stimulus become more 
prominent. Békésy's analysis of the 
situation revolves around the equa- 
tion expressing particle velocity, u, 
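in a spherical sound field as the fol-
lowing function:

$$u = \frac{1}{r^{2}}\, f\!\left(t - \frac{r}{c}\right) + \frac{1}{rc}\, f'\!\left(t - \frac{r}{c}\right)$$

where r = distance from the center of
the sphere and c = the velocity of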
sound. Békésy states that auditory 
depth perception is determined by 
the comparative sizes of the two parts 
of the right hand side of the equation: 
the flow velocity, f(t); and its first 
time derivative, f’(t) (related to the 
corresponding pressure change). 
Since one expression contains r, and
the other r², their relative magnitude
changes as a function of distance from
the source, f(t) becoming more im-
portant at smaller distances. He
demonstrated experimentally that an
increase in the magnitude of f(t),
accomplished by integrating an elec-
trical signal before applying it to the
sound source, brings the apparent
source of sound closer to the observer.
Conversely, differentiating the elec-
trical signal, thereby increasing the
magnitude of f'(t), makes the sound
source appear to be farther away.
Békésy points out that harmonic 
analysis shows the pressure change 
related to the function f'(t) to be 
deficient in low harmonics relative to 
f(t). Since an increase in the magni- 
tude of f(t) causes the sound source to 
appear closer to the observer, an in- 
crease in the low frequency content 
of the stimulus (clicks) should, there- 
fore, cause the stimulus to appear 
closer. Békésy found this to be the
case for an actual source distance of
25 cm. A decrease in the low fre-
quency content caused the stimulus
to seem more distant. 

The equation presented by Békésy 
is descriptive of spherical sound fields 
only, and the experiments conducted 
by Békésy did not extend beyond 
distances at which sound waves 
emanating from a source cease to be 
approximately spherical. 
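
Békésy's argument about the relative size of the two terms can be illustrated for a sinusoidal signal. The sketch assumes the usual spherical-wave form, in which the f term is divided by r² and the f' term by rc. For a sinusoid of a given frequency the ratio of their peak magnitudes is then c divided by (2π × frequency × r). The function name and the value c = 343 m/sec (air near 20° C.) are assumptions introduced here, not values taken from Békésy.

```python
import math

SPEED_OF_SOUND_M_PER_S = 343.0  # assumed value for air near 20 degrees C.


def flow_to_derivative_ratio(frequency_cps, r_m, c=SPEED_OF_SOUND_M_PER_S):
    """Peak magnitude of the f term (1/r^2) relative to the f' term (1/rc) for a sinusoid."""
    if frequency_cps <= 0 or r_m <= 0:
        raise ValueError("frequency and distance must be positive")
    return c / (2.0 * math.pi * frequency_cps * r_m)


# At 25 cm the f term dominates for a 100 cps component but not for 1,000 cps.
low_near = flow_to_derivative_ratio(100.0, 0.25)
high_near = flow_to_derivative_ratio(1000.0, 0.25)

# At 4 m the f term is small even for the 100 cps component.
low_far = flow_to_derivative_ratio(100.0, 4.0)

print(round(low_near, 2), round(high_near, 2), round(low_far, 2))
```

The ratio falls with both frequency and distance, which agrees with the statements that f(t) matters most close to the source and that it is relatively rich in low harmonics.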

It is known that, in addition to the
effect described by Békésy, high fre-
quencies are differentially attenuated
relative to lower frequencies by pas-
sage through air (Ingard, 1953; Nyborg
& Mintzer, 1955). This loss, which is
also in addition to the inverse first
power loss, is known as the expo-
nential loss. This loss is expressed in
decibels as a(R − Ro), where R − Ro
= the distance between source and
observation point and a = the absorp-
tion coefficient, which expresses the
loss in decibels per unit distance. The
absorption coefficient, a, is composed
of two portions having differing physi-
cal origins: the classical absorption
coefficient, a_class, and the molecular
absorption coefficient, a_mol (a = a_mol
+ a_class). a_class is small (e.g., approxi-
mately .6 db/1000 ft at 0° C., atmos-
pheric pressure, and 4,000 cps) in
relation to a_mol (approximately 25
db/1000 ft under comparable condi-
tions at 40% humidity) at audio fre-
quencies and will not be discussed
further. a_mol is expressed by:


a_mol = [function of f, c, R, Cv, Ci, and fm]          [2]

where: 


f = sound frequency
c = phase velocity of sound
R = molar gas constant
Cv = total heat capacity per mol of air at
constant volume
fm = a function of humidity approximately
according to fm = 1.01 X10% h² where h
is in grams per cubic meter
Ci = vibrational heat capacity per mol
a_mol = absorption coefficient in nepers/cm.
Loss in db. = 8.68 × loss in nepers.


Sample values of the absorption 
coefficient (over and above the 1/R 
loss) computed from this equation 
(with the classical absorption coeffi- 
cient added) are presented in Table 1. 
The computed values of "a" have 
been corrected on the basis of results 
which show experimentally deter- 
mined values of “a” to be higher than 
the theoretical values. The two sets 
of “a” are derived from two different 
sets of experimentally determined 


TABLE 1 


ABSORPTION COEFFICIENT IN DB/100 FT AT 
50% RELATIVE HUMIDITY AND 20° C.


Frequency in cps        a

[Table values illegible]


Note. a is absorption coefficient.


correction figures (Nyborg & Mintzer,
1955). As can be seen from Equation 
2, these values are highly dependent 
on the absolute humidity (or relative 
humidity and temperature). The 
values of temperature and humidity 
chosen for the examples in Table 1 
were taken to represent typical condi- 
tions in a temperate climate. Under 
other climatic conditions the absorp- 
tion coefficient would be different by 
as much as several thousand percent 
(depending on frequency). Terrain 
and inhomogeneities in the atmos-
phere may also influence the value
of a.

Referring to the data of Karlin 
(1945), we see that the ear can detect 
changes in the energy distribution of 
a complex noise (frequency spectrum 
200-10,000 cycles) when the maxi- 
mum change at the extremes of the 
spectrum are less than 1 db. On the 
basis of these data Karlin concludes 
“the ear can detect an astonishingly 
small shift in the energy distribution 
of a complex noise.” 

The data of Table 1, in light of 
Karlin’s results, show that distance 
changes of the order of 20-30 feet may
produce frequency spectrum changes 
that can be detected by the ear (under 
the conditions assumed for the deriva- 
tion of Table 1), particularly when 
the stimulus spectrum includes the 
higher frequencies. At other values of 
temperature and humidity this esti- 
mate would vary considerably. 
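
The comparison made in this paragraph amounts to asking how much more the high end of the spectrum is attenuated than the low end over a given change in distance. A minimal sketch follows. The function names are hypothetical, and the two absorption coefficients are placeholders chosen for illustration; they are not the values of Table 1.

```python
def excess_attenuation_db(a_db_per_100_ft, r_ft, r0_ft):
    """Exponential (absorption) loss a(R - Ro) in db., over and above the (1/R) loss."""
    return a_db_per_100_ft * (r_ft - r0_ft) / 100.0


def spectral_tilt_db(a_high, a_low, r_ft, r0_ft):
    """Additional loss of a high-frequency component relative to a low-frequency one."""
    return (excess_attenuation_db(a_high, r_ft, r0_ft)
            - excess_attenuation_db(a_low, r_ft, r0_ft))


# Placeholder coefficients in db/100 ft (NOT taken from Table 1).
A_HIGH_HYPOTHETICAL = 4.0
A_LOW_HYPOTHETICAL = 0.2

# Change in relative level across the spectrum for a 25-ft increase in distance.
tilt_25_ft = spectral_tilt_db(A_HIGH_HYPOTHETICAL, A_LOW_HYPOTHETICAL, 35.0, 10.0)

print(round(tilt_25_ft, 2))
```

With these placeholder coefficients a 25-foot change tilts the spectrum by roughly 1 db., which is the order of change Karlin reports as detectable. Other temperatures and humidities would give different coefficients and therefore a different estimate.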

Preliminary experiments (Cole- 
man, 1955) at distances between 8 
feet and 28 feet have indicated that 
the effect of frequency spectrum on 
apparent distance is in the direction 
predicted on the basis of the differ- 
ential attenuation of high frequencies 
by passage through air. Similar re- 
sults have also been reported by 
Hornbostel (1923). 

These data do not necessarily con- 
flict with Békésy’s data because of 
differences in the distances that were 
studied. It is entirely reasonable to 
suppose that when cues for distance 
can no longer be derived on the basis 
of a spherical wave front, other cues 
become predominant. Conversely, 
when frequency spectrum is less 
usable as a cue because of small 
differences in distance, there is the 
opportunity at near distance for 
this cue to behave in another way. 

Although it does appear that fre-
quency spectrum may influence dis-
tance judgments, the degree of im-
portance that can be attributed to a
cue that is so changeable (at “far” 
distances) as a function of environ- 
mental influences is open to serious 
question. Whether the operation of 
this cue at far distances is minimal or 
confusing, or whether corrections are 
made for changes in environmental 
conditions (perhaps by learning or 
applying a new scale of spectrum 
change versus distance whenever 
necessary) remains to be seen. Békésy
(1938) has, in addition, pointed out 
the possibilities for complex inter- 
actions among this cue and other cues 
to distance. 

It seems that both intensity and 
frequency spectrum (at far distances) 
are useful only as relative cues to 
distance since both these stimulus 
correlates may change without any 
accompanying distance change. 
Thus, neither of these cues bears an 
invariant relation to distance. Experi-
mental evidence (Gamble, 1909) indi-
cates that when subjects are allowed
to respond either in terms of loudness 
or distance (regardless of which is 
changed) both responses are equally 
likely. In other words, comparison 
stimuli displaced in space or time are 
necessary for valid distance judg- 
ments. The same appears to be true 
of frequency spectrum at near dis- 
tances for Békésy (1938) has shown 
that apparent distance can be manipu-
lated by frequency spectrum changes
when the source is actually located 5 
to 50 cm. from the receiver. Coleman 
(1959) has shown that initial judg-
ments of the distance of the source of
an unfamiliar complex stimulus, pre-
sented without any comparison stimu-
lus, are unrelated to actual source
distance. As experience with the 
stimulus increases, distance judg- 
ments become reasonably accurate. 


One may utilize distance units such 
as feet in expressing a judgment once 
sufficient experience with a particular 
stimulus has been accumulated, but 
the judgment remains a relative one. 


Pinna Effects 


Although it has long been known 
that the pinnae act on incident sound 
waves, it has only recently been sug- 
gested that these effects may, in part, 
relate to distance. Batteau (1962) 
points out that the pinnae delay in- 
coming signals through a number of 
different paths so that the resultant 
signal yields an autocorrelation func- 
tion which is characteristic of the 
orientation of the observer with re- 
spect to the sound source. Batteau 
suggests that such changes in the 
stimulus may yield information about 
distance, as well as azimuth and ele- 
vation. A quantitative evaluation of 
pinna effects related to distance has 
not yet been made, and no psycho- 
physical study of these effects is 
available. 


BINAURAL CUES TO AUDITORY 
DISTANCE


Little attention has been given in 
the past to binaural cues to auditory 
distance. Although physical acoustics 
indicates possible binaural cues, psy- 
chophysics has thus far made little 
use of this information. 

The development of this area of 
physical acoustics can be attributed 
largely to Rayleigh (1945) and Stew- 
art (1911, 1914, 1916) who applied 
the Helmholtz theorem of reciprocity 
to the earlier work of Stokes (1868) 
which described the sound field pro- 
duced by a vibrating sphere. Their 
development of equations describing 
the sound field around a rigid sphere 
and their comments concerning the 


* D. W. Batteau, personal communication, 
1962. 
application of these equations to
localization problems formed the 
basis of much of the work that fol- 
lowed. The relationships that were 
developed by Rayleigh and Stewart 
are quite complex and the hand com- 
putation of a set of points from these 
equations is a tedious process—for a 
complete development of these equa- 
tions the reader is referred to Stewart 
(1911, 1914, 1916) and Rayleigh 
(1945). The most extensive set of 
curves so obtained appears to have 
been published by Hartley and Fry 
(1921). Their curves describe differ-
ences in phase and relative “intensity”
(pressure) at two points (165° and 
180° apart) on the surface of a rigid 
sphere 8.75 cm. (approximately head 
radius) in radius. The data are given 
as a function of angular displacement 
and distance of a sound source for five 
frequencies from 310 cps to 1,860 cps. 
The distances dealt with are limited 
to 16.50 cm., 43.75 cm., 87.50 cm.,
and infinity. The set of curves com- 
puted for 310 and 1,860 cps for the 
case where the “ears” are 165° apart 
are presented in Figure 1. 

At 310 cps the binaural intensity 
ratio (BIR)—in actuality ratio of 
sound pressure at the two ears— 
changes appreciably with distance as 
a function of azimuth. At 0° and 
180° BIR does not, of course, change 
with distance. At 90°, where the 
distance effect is greatest, a change in 
source distance from 16.5 cm. to 
43.75 cm. produces a change in BIR 
from about .04 to about .29. At 40° 
the same change in distance produces 
a change in BIR from about .20 to 
about .47. As stimulus frequency in- 
creases the change in BIR with dis- 
tance decreases considerably except 
in the region of 90° azimuth. For 
complex sounds this differential fre-
quency effect produces a difference in
frequency spectrum at the two ears,
which may also serve as a distance
cue. No pertinent psychophysical 
data for evaluating this possibility 
are available. 

Changes in binaural phase as a 
function of distance are smaller at 
lower than at higher frequencies. In 
general, changes in binaural phase as 
a function of distance are greater at 
positions closer to the head and mid- 
way between the ears, these effects 
varying with frequency in a manner 
indicated by Figure 1. For example, 
at about 90° azimuth and 16.50 cm., 
where the phase effect is maximal, 
the phase difference between the two 
ears with a 310 cps stimulus is 89°. 
At 43.75 cm., under the same condi- 
tions, the phase difference is 87°. 
With a 1,240 cps stimulus at about 90°
azimuth the phase differences change 
from 330° at 16.50 cm. to 320° at 
43.75 cm. and 315° at 87.50 cm. At 
1,860 cps phase differences greater 
than 360° are computed. 

“Binaural” differences in sound 
pressure and phase were later meas- 
ured directly as a function of source 
distance and angular displacement, 
using a dummy head with micro- 
phones as ears (Firestone, 1930). 
The microphones were at the ends of 
tubes approximating the external 
meatus and a pinna was included. 
The amplitude ratios were somewhat 
smaller than those computed by 
Hartley and Fry while the phase 
differences varied as a function of 
distance to a greater extent than the 
values computed. Table 2 presents a 
comparison of selected theoretical 
results with the measured results. 

FIG. 1. "Binaural" intensity ratio (I) and phase difference (P) as a function of the angular
displacement and distance (r) of a sound source. (k = 2π/λ, c = radius of the sphere, r = distance
of the sound source from the center of the sphere. See Hartley & Fry, 1921.)

Woodworth and Schlosberg (1954,
pp. 351-353) and Pieron (1952, p.
241) discuss binaural time differences
at near and far distances as a function
of azimuth. Their treatments of this
problem are similar and are based on
the assumption that the sound wave
travels directly around the circum-
ference of the head at the speed of
sound. Under these circumstances,
phase can be directly derived from
time. Woodworth and Schlosberg
present values for time differences at
near and far distances. The range of
values is indicated in Table 3.
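
The path-length assumption described above can be sketched numerically.
The following is a minimal illustration, not the authors' own computation:
it uses the distant-source form usually attributed to Woodworth, in which
the extra path to the far ear is r(θ + sin θ) for azimuths between 0° and
90°. The speed of sound (34,400 cm/sec) and the function names are
assumptions introduced here; only the 8.75 cm radius comes from the text.

```python
import math

HEAD_RADIUS_CM = 8.75            # sphere radius quoted in the text
SPEED_OF_SOUND_CM_S = 34400.0    # assumed value, not stated in the text


def far_field_itd_ms(azimuth_deg, radius_cm=HEAD_RADIUS_CM,
                     c_cm_s=SPEED_OF_SOUND_CM_S):
    """Binaural time difference (msec) for a distant source.

    Assumes ears 180 degrees apart and a wave travelling around the
    circumference of a rigid sphere; intended for 0 to 90 degrees azimuth.
    """
    theta = math.radians(azimuth_deg)
    extra_path_cm = radius_cm * (theta + math.sin(theta))
    return 1000.0 * extra_path_cm / c_cm_s


def phase_from_time_deg(time_diff_ms, freq_cps):
    """Phase difference (degrees) implied by a time difference for a pure tone."""
    return 360.0 * freq_cps * time_diff_ms / 1000.0


if __name__ == "__main__":
    itd_90 = far_field_itd_ms(90.0)
    print(f"time difference at 90 degrees: {itd_90:.3f} msec")
    print(f"phase at 620 cps: {phase_from_time_deg(itd_90, 620):.0f} degrees")
```

With these assumed constants the 90° value comes out at roughly .65 msec,
close to the distant-source entry of Table 3; a slightly different speed of
sound would shift the third decimal place.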

The figures in parentheses are 
derived from the curves of Hartley 
and Fry (1921) for a 620-cps tone at 
90° at 16.50 cm. and infinity. Both
sets of data assume a head radius
of 8.75 cm. Woodworth and Schlos-
berg assume the ears to be at 180° to
each other while Hartley and Fry 
assume them to be at 165°. The 
differences between the two sets of 
data are not entirely due to possible 


TABLE 2
A COMPARISON OF SELECTED THEORETICAL
RESULTS WITH THE MEASURED RESULTS

              Amplitude ratio           Phase difference
Distance   ----------------------   ----------------------
(cm.)      Theoretical   Observed   Theoretical   Observed

256 cps, 100° azimuth
  400          .90          .81          71°          73°
  100          .?6          .66          71°          70°
   50          .57          .50          72°          74°
   20          .28          .22          73°          68°

1,024 cps, 80° azimuth
  400          .68          .60         262°         240°
  100          .58          .51         267°         249°
   50          .47          .??         269°         265°
   20          .25          .21         277°         307°


TABLE 3
BINAURAL TIME DIFFERENCE
IN MILLISECONDS

Azimuth     Near source     Distant source
  0°           .009             .009
 90°        .799 (.771)      .653 (.732)

differences in distance and placing
of the ears, but also to the greater 
complexity of assumptions on which 
Hartley and Fry data are based. 
There can be little doubt that the 
ear can, under many conditions, de- 
tect the changes in binaural differ- 
ences in sound pressure resulting from 
changes in the distance of the source. 
Upton's (1936) data indicate that 
binaural sound pressure differences of 
from 1 db. to 3 db. (depending on 
loudness variation between 30 and 
120 db. SL) are detectable. Convert- 
ing Upton's data to BIR for compara- 
tive purposes yields values in the 
range .91-.99. Ford's (1942) data are
in general agreement with Upton's 
results. Mills (1958) computed 
limens for interaural differences in 
loudness on the basis of angular 
localization data (all at 50 db. SL) 
and obtained DLs considerably 
smaller than those of either Upton or 
Ford (from .3 db. [BIR = .99] at 1,000 cps
to a maximum of 1.4 db. [BIR = .97]
at 8,000 cps, with a dip toward 10,000 cps).
These data were obtained at fre- 
quencies where phase cues were pre- 
sumed to be minimal in influencing 
the localization judgments. In a 
later study Mills (1959, 1960) deter- 
mined thresholds for interaural loud- 
ness differences directly, using the 
method of single stimuli. The thresh- 
olds obtained were larger than those 
computed on the basis of localization 
results (the maximal value converted 
to BIR was .96—2 db.—at 1,000 cps 
and 50 db. SL; two octaves above or 
below 1,000 cps the values were about 
.98). Analogous thresholds obtained 
by the method of constant stimuli 
were about one-half as large, and 
therefore more compatible with li-
mens computed from angular localiza-
tion data. The change in binaural
sound pressure ratio described by 
Firestone (1930) under nearly optimal 
conditions (100° azimuth at 256 cps)
is from .22 at 20 cm. to .50 at 50 cm. 
It would appear that under these con- 
ditions binaural differences in sound 
pressure could easily provide an ac- 
curate distance cue. The distance 
change necessary to produce a de- 
tectable change in binaural pressure 
ratio increases as (a) azimuth departs 
from 90°, and/or (b) frequency in- 
creases, and (c) absolute distance 
from the observer increases. It is 
doubtful that binaural pressure can 
be a useful cue to distance beyond 
approximately 15 feet with optimum 
azimuth and frequency (see Fire- 
stone, 1930). 

Greater changes in distance than 
are needed to produce detectable 
BIRs are, in general, required to pro- 
duce binaural phase differences that 
are detectable by the ear. Zwislocki 
and Feldman (1956), using dichotic 
stimuli, have shown that the jnd for 
phase increases from about 2.5° at
250 cycles to about 6° at 1,000 cycles 
and about 11° at 1,300 cycles 
(SL=50 db.). Determinations by 
Klumpp and Eady (1956) and Mills 
(1958) (computed from angular locali- 
zation results) are in essential agree- 
ment with these data. The use of 
clicks, pulses, or noise instead of pure 
tones greatly alters the just noticeable
interaural time difference as do
changes in SL and duration of the
stimuli (e.g., Klumpp & Eady, 1956;
Tobias & Zerlin, 1959). Looking at
the phase data of Firestone (1930) 
in terms of the Zwislocki and 
Feldman (1956) data provides the 
approximate data of Table 4. 
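
The arithmetic behind Table 4 can be sketched as follows. This is a
hedged reconstruction rather than the authors' stated procedure: it
assumes that each entry is the change in Firestone's observed phase
difference (the observed columns of Table 2) divided by the jnd for the
nearest frequency, taking the jnd as about 2.5° near 256 cycles and 6°
near 1,024 cycles. The variable and function names are hypothetical, and
the Table 2 azimuths (100° and 80°) are treated as approximating 90°.

```python
# Observed binaural phase differences in degrees, by frequency (cycles)
# and source distance (cm.), as read from the observed columns of Table 2.
OBSERVED_PHASE_DEG = {
    256: {20: 68, 50: 74, 100: 70, 400: 73},
    1024: {20: 307, 50: 265, 100: 249, 400: 240},
}

# Assumed just noticeable differences for phase (degrees), taken from the
# Zwislocki and Feldman values quoted for 250 and 1,000 cycles.
JND_DEG = {256: 2.5, 1024: 6.0}

DISTANCE_STEPS_CM = [(20, 50), (50, 100), (100, 400)]


def phase_change_in_jnd(freq, near_cm, far_cm):
    """Absolute change in observed phase between two distances, in jnd units."""
    phases = OBSERVED_PHASE_DEG[freq]
    return abs(phases[far_cm] - phases[near_cm]) / JND_DEG[freq]


if __name__ == "__main__":
    for freq in (256, 1024):
        for near_cm, far_cm in DISTANCE_STEPS_CM:
            value = phase_change_in_jnd(freq, near_cm, far_cm)
            print(f"{freq:>5} cycles  {near_cm}-{far_cm} cm.  {value:.1f} jnd")
```

Under these assumptions the six ratios agree with the legible entries of
Table 4, which suggests that the table was derived in roughly this way.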

At frequencies above 1,200 to 1,500
cycles, phase differences are no longer
useful. Phase differences also ap-
proach zero as azimuth approaches 0°
or 180° and as distance increases. It
thus appears that under certain condi-
tions binaural phase differences may
provide distance information.

TABLE 4
CHANGE IN BINAURAL PHASE
DIFFERENCE IN JND

Distance change            Frequency
(at 90°)           256 cycles   1,024 cycles
20-50 cm.              2.4           7.0
50-100 cm.             1.6           2.7
100-400 cm.            1.2           1.5

Although there seems to be evi- 
dence to indicate that binaural differ- 
ences resulting from distance changes 
should be detectable by the ear, it is 
still not known whether this binaural 
difference information is translated 
by the listener into distance judg- 
ments. Wightman and Firestone 
(1930) attempted to investigate this 
problem psychophysically. A 256- 
cycle tone was presented to subjects 
through earphones and binaural phase
and sound pressure were varied. 
"Impossible" combinations of bin- 
aural phase and BIR were presented 
as well as “possible” combinations. 
Subjects were asked to report ap- 
parent distance and apparent azi- 
muth. It was predicted by the experi- 
menters that subjects would deter- 
mine azimuth on the basis of binaural 
phase and utilize the BIR informa- 
tion to make distance judgments. 
The distance judgments did not 
conform to this prediction. It is not 
possible to determine whether the 
distance judgments were made on the 
basis of both cues presented since the 
two cues were often incompatible in 
terms of naturally occurring combi- 
nations of values. No distance judg- 
ments can be accurately predicted 
under these circumstances, although 
Hartley and Fry (1921) suggested 
that location judgments based on 
conflicting cues will (a) tend to
minimize adjustments in phase and
intensity necessary to achieve a real
source location and (b) utilize certain
prejudices of the subject for particular
distances or azimuth locations. The 
distance judgments made by the sub- 
jects bore no apparent relation to the 
stimulus conditions, most judgments 
being "close." Sandel, Teas, Fedder-
sen, and Jeffress (1955) have dem-
onstrated that under certain condi-
tions of conflicting time and intensity
cues, apparent azimuth is erratic. 
Wightman and Firestone concluded 
that binaural cues to distance are not 
effective. In view of the confused cue 
situation presented to the subjects, 
however, it seems more reasonable to 
conclude that such a general state- 
ment concerning the effectiveness of 
binaural cues to distance is not 
warranted by these data. 

The possibility that binaural
cues are effective in distance percep- 
tion is indicated by the psychophysi- 
cal data demonstrating ability to 
localize in depth when loudness and 
frequency spectrum cues may have 
been unavailable. The studies of 
Mohrmann (1939) and Fletcher 
(1957), in which subjects were ap- 
parently able to localize tonal stimuli 
in the presence of misleading loud- 
ness cues, provide such data. Arps 
and Klemm (1913), on the basis of 
results from one subject, suggest that 
BIR may be a factor in distance 
localization, since the subject could 
localize with loudness and timbre cues 
removed. It is not, however, possible
to build a strong case on the basis of 
these results since the extent to which 
distortions may have allowed fre- 
quency spectrum to act as a cue is not 
known. 

Contrary to the situation prevail-
ing with respect to the monaural cues,
it seems possible that the binaural
cues may under certain circumstances 
permit absolute judgments. Whereas 
neither monaural intensity nor mon- 
aural frequency spectrum uniquely 
signifies distance (rather than loud- 
ness or timbre) the binaural cues 
provide a unique set of values for a 
portion of space. These values are not 
altered by changes in the stimulus 
other than source distance. Of 
course, these cues do not operate in 
the midline, and their detectability by
the ear appears limited both in terms 
of distance and stimulus character- 
istics. 


SUMMARY 


This paper has discussed five 
stimulus characteristics that may 
provide cues to auditory distance: 
(1/R) loss, frequency spectrum at
near distances, frequency spectrum at
far distances, binaural intensity, and
binaural phase or time differences.

Others have been alluded to: re- 
flected sound waves, binaural differ- 
ences in frequency spectrum, and 
pinna effects. It has been pointed out 
that all the possible cues discussed, 
with the exception of the (1/R) loss,
are either usable only at near dis- 
tances or may provide confusing 
information (e.g., frequency spectrum 
at far distances). The cue situation 
apparently provides more reliable 
information at near distances than at 
far distances. Considered in toto, the
perception of auditory distance seems 
to rely on a complex cue situation 
which presents some interesting anal- 
ogies with visual depth perception. 

We may conclude that: 

1. Loudness appears to be well 
established as a cue to auditory depth 
perception. There is, however, a dis- 
crepancy between the experimentally 
determined DL for distance and the 
DL for distance predicted on the
basis of the DL for loudness. Deter- 
minations of the distance DL in 
dynamic (moving source or moving 
observer) situations are needed. 

2. Although there is a general 
agreement that frequency spectrum 
is a cue in auditory depth perception, 
there is controversy as to the manner 
in which this cue influences distance 
judgments. It was suggested that 
this controversy has no basis in fact, 
and could be resolved by con- 
sidering frequency spectrum to oper- 
ate as a cue in one manner at near 
distances and in another manner at 
far distances.

3. Physical acoustics has indicated 
the possibility that there may be 
binaural cues to distance within a 
restricted distance range. This possi- 
bility has not yet been adequately 
tested in a psychophysical experi- 
ment. Possible pinna effects have 
also not yet been evaluated. 
ASSESSMENT OF ORGANIC BRAIN DAMAGE
BY PSYCHOLOGICAL TESTS

Approaches to the diagnosis of organic brain damage reflect diverse
concepts of the nature of brain damage and of its behavioral effects, 


designed on the assumption of a unitary concept of organicity, the evi- 
dence supports the conclusion that brain damage is a multidimensional 


Historically, the psychological as- 
sessment of brain damage reveals 
many diverse paths which often seem 
contradictory. The literature on this 
problem reflects a myriad of psycho- 
logical concepts of brain damage, 
with many meanings and ramifica-
tions. Both naive practice and the
overwhelming bulk of published re- 
search appear to accept the term 
brain damage as a unitary diagnostic 
entity, although the predominant 
evidence indicates that it represents 
a complex and multifaceted category.


DIAGNOSTIC SIGN APPROACH


The first approach to diagnosing 
brain damage was the psychiatric 
method based primarily upon 
behavioral symptomatology. This 
method continues to be popular and 
psychiatric classification is a frequent 
validity criterion in psychometric
studies. This use of psychiatric
classifications as criterion has led to
varied, subjective, and often contra-
dictory results. In reviewing "or-
ganic" signs on the Rorschach,
Eckhardt (1961) indicated that or-
ganic signs may be related to geo-
graphical location, for the criteria for
diagnostic categories differ from
place to place. Other studies (Frank,
Corrie, & Fogel, 1955; Wittenborn, 
1952) support Eckhardt's views, for 
the use of behavioral symptoms as a 
basis for classification is usually 
descriptive of the symptoms but not 
necessarily related to their etiology. 

Such results bring into focus one of 
the basic issues in brain damage re- 
search, the problem of definition, 
which depends in turn on theory. In 
1954, Yates pointed out that most of 
the theories of brain damage function- 
ing hypothesize one or another single 
classification of brain damage, and, 
therefore, they lead to erroneous 
conclusions in test construction and 
interpretation. Subsequently Reitan
(1955d, 1955f, 1958b, 1959b) re-
ported several studies in which
emphasis was placed on relating test
performance to brain functions in
which the nature and extent of pathol- 
ogy were carefully determined. Al- 
though this approach has yielded 
important information indicating the 
possibility of predicting the existence 
of brain damage from empirically 
selected psychological tests, the ac- 
curacy of classification is high for 
groups, but imperfect for individual 
diagnosis. The residual variance in
the test scores is attributed to factors
other than organic brain damage. 


SINGLE VARIABLE TESTS


Frequent attempts have been made 
to construct tests for brain damage, 
taking one aspect of behavior as a 
basis for diagnosis. Representative 
examples include the Bender Gestalt 
Test, the Memory-for-Designs test, 
and the Spiral Aftereffect test, to 
name just a few. Although such tests 
frequently involve several factors 
such as vision, psychomotor coordi- 
nation, and memory, they all assume 
an inherent quality which differenti- 
ates the performance of brain dam- 
aged subjects from non-brain-dam- 
aged subjects. 

Studies using the Spiral Aftereffect 
test (Davids, Goldenberg, & Laufer, 
1957; Page, Rakita, Kaplan, & Smith, 
1957; Philbrick, 1959; Stilson, Gyn- 
ther, & Gertz, 1957), the Bender 
Gestalt (Goldberg, 1959; Griffith & 
Taylor, 1961; Hannah, 1958; Hanvik 
& Andersen, 1950; Lonstein, 1954; 
Mehlman & Vatorec, 1956; Tolor, 
1956), and the Memory-for-Designs 
test (Hovey, 1961; Howard & Shoe- 
maker, 1954; Wahler, 1956) have 
revealed two major findings: 

First, although these tests demon- 
strate a certain amount of diagnostic 
success, they frequently identify too 
many false positives and false nega- 
tives, and differentiation between 
groups is usually gross. As a result, 
many significant findings are errone- 
ous when applied to individual cases. 
The usefulness of single variable tests 
may be enhanced, however, when 
they are included in test batteries 
which integrate specific measures 
into more molar and predictively 
more meaningful composites. 

The second major finding is that 
the basis for diagnosis rests on tenu- 
ous assumptions, since it has been 
reliably demonstrated (Griffith &
Taylor, 1961; Hannah, 1958; Hovey, 
1961; Shapiro, 1951, 1952, 1953; 
Williams, Lubin, Gieseking, & Rubin- 
stein, 1956) that the diagnostic signs 
may be properties of the stimulus. 
Before any deviations in performance 
are attributed to organismic variables, 
the stimulus properties and the na- 
ture of the task need to be syste- 
matically investigated. 


SELECTION OF SUBJECTS


One of the most critical problems in 
brain damage research is that of 
selection of subjects. Although 
numerous studies have differentiated 
between organic and nonorganic pa- 
tients, the basis for differentiation 
often loses its potency when psy- 
chotic subjects (e.g., schizophrenics) 
are utilized, for then the underlying 
test assumptions are not restricted to 
brain damage alone. Where psy- 
chotic subjects have been used, 
variables such as age and sex have 
frequently been uncontrolled. Young 
and Pitts (1951) have shown that 
racial factors can also be significant
variables. This finding may also be 
linked with socioeconomic back- 
ground. 

The selection of controls is a major 
consideration, but the proper selec- 
tion of brain damaged subjects is 
equally important. For example, 
differential effects in performance 
have been found between left and 
right cerebral lesions (Heimburger & 
Reitan, 1961; Reitan, 1955b; Reitan 
& Tarshes, 1959). Furthermore,
Reitan (1959c) demonstrated a real 
possibility for the existence of differ-
ent types of pathological brain in-
volvement which may have different
psychological effects. He reported 
significant agreement among infer- 
ences based on neuropsychological 
data and neurological data for the 
following categories: (a) diffuse, focal,
focal and diffuse, or bilaterally focal 
cerebral damage; (b) right or left 
cerebral hemisphere involvement; (c) 
lobular localization of areas of maxi- 
mal involvement within each hemi- 
sphere; (d) static, slowly progressive, 
moderately progressive, or rapidly 
progressive character of the lesion; 
and (e) lesion categories including 
cerebral vascular disease, tumor, 
degenerative or demyelinating dis- 
ease, inflammatory or infectious dis- 
ease, and trauma. 

Nevertheless, the indiscriminate 
acceptance of brain damaged subjects 
as homogeneous has been reflected in 
the unrestricted use of varied types 
of brain damage in single samples. 
Undoubtedly this erroneous practice 
is responsible for a large portion of 
the variance in contradictory findings.
There is no longer any justification
for this unitary concept. A consider-
able accumulation of evidence (Ander-
sen, 1951; Andersen & Hanvik, 1950;
Belmont & Birch, 1960; Doehring & 
Reitan, 1960; Fisher, 1958; Kooi, 
Boswell, & Thomas, 1958; McGaugh- 
ran & Moran, 1957; McMurray, 
1954; Reitan, 1954a, 1955a; Semmes,
Weinstein, & Teuber, 1954; Sindberg, 
1961) indicates that there are many 
variations in the performance of brain 
damaged individuals with many con-
tributory influences.


SCATTER PATTERNS 


In modest contrast to the use of 
single tests, scatter patterns on the 
Wechsler Bellevue Intelligence Scale 
have been employed in differential 
diagnoses with varying degrees of suc- 
cess. The general conclusion, how- 
ever, is unfavorable for the use of the 
Wechsler scatter patterns in diag- 
nostic research (Allen, 1949; Cohen, 
1955; Gjesvik, 1957; Jastak, 1953; 
Wittenborn & Holtzberg, 1951). In 
addition, two factor analytic studies, 
by Cohen (1952a, 1952b), demon-
strated that the Arithmetic, Picture
Arrangement, Block Design, Digit 
Symbol, and Picture Completion 
subtests did not measure the same 
common factor in three different 
diagnostic groups. It was, therefore, 
necessary to know the patient's diag- 
nosis in order to figure out what factor 
a subtest was measuring. 

One factor that may explain the 
apparent loss of diagnostic potency of 
Wechsler subtest patterns is the as- 
sumption of unitary brain damage in 
assembling experimental groups; this 
could produce sufficient variation 
among subjects to cancel meaningful 
differences among groups. It has 
been demonstrated, for example, that 
right hemispheric lesions appear to 
produce different effects from left
hemispheric lesions on the Wechsler-
Bellevue (Hallgrim & Reitan, 1958).

A modification of subtest analysis 
was offered by Hewson (1949a) in
which ratios between combinations of 
subtests were used in establishing 
critical values for diagnostic differ- 
entiation. A later study conducted by 
Bryan and Brown (1957) reported 
only 8% false negatives with the use 
of Hewson's ratios. Further valida- 
tion is needed for this promising ap- 
proach. An advantage of Hewson's 
method is that it does not depend upon
the assumption of a typical scatter 
pattern for brain damaged subjects. 

Reitan (1959a) has shown that the
Halstead Impairment Index appears 
to be more sensitive than the Wechs- 
ler-Bellevue scale in detecting brain 
damage. The differences between 
these two instruments in this regard
appear to reflect the different assump-
tions underlying their construction.


QUALITATIVE APPROACHES 


Another popular approach in as- 
sessing brain damage may be de- 
scribed as qualitative, both because 
of the implied assumption that corti-
cal malfunctioning results in qualita-
tive changes in behavior patterns,
and because the testing methods
employed are extremely "qualitative"
or global in procedure. With regard 
to the assumption of qualitative 
changes in brain damaged subjects, 
Reitan (1958a, 1959b) has demon- 
strated that the abilities of brain 
damaged subjects are measurable on 
the same continua and interrelated 
comparably to those of non-brain- 
damaged subjects. Since the abilities 
which are impaired in the brain dam- 
aged subjects appear to be the same 
as those found in non-brain-damaged 
subjects, the difference is quantita- 
tive rather than qualitative. 

The global approach is typified by 
the Proverbs test, the Rorschach, 
and the Goldstein tests. Although 
the Rorschach uses a scoring system, 
the methods of scoring do not possess 
the stringent specification of quanti-
tative psychometrics, and the
reliability of these subjective meth- 
ods is frequently insufficient. Reitan 
(1954b, 1955c, 1955e, 1955g) reported 
some success in diagnosing brain 
damage with the Rorschach, but re- 
garded it as inferior to other available 
methods. 

Whenever reliability and validity 
are reported to be high, this usually 
appears to be a function of the 
clinician standardizing himself, and 
as a result interindividual reliability 
is usually poor. This is not conclusive 
evidence that qualitative methods 
are of no use, but it does argue for 
caution in employing these diag- 
nostic approaches. 

Although the Bender Gestalt Test 
is a qualitative technique, it differs 
from the Proverbs test, the Ror- 
schach, and Goldstein tests in some 
of the underlying assumptions of 
abstraction. These differences have 
led to controversies concerning the 
neurophysiological basis for abstrac-
tion, as well as other forms of higher
order behaviors, and to the conflict 
over general versus specific cortical 
functioning. Consideration of the 
problems of postulating a neuro- 
physiological basis for abstraction is 
necessary to bring the broad problem 
of brain damage research into per- 
spective. This will serve to point out 
new areas of conflict as well as to 
reiterate previously mentioned prob- 
lems. 


NEUROPHYSIOLOGICAL 
CONSIDERATIONS 


Before discussing the basis of ab- 
stractions, some observations of a 
semantic nature appear justified con- 
cerning the term abstract. Although 
the word is rooted in the Latin 
abstractus, which means drawn away, 
modern usage is multifaceted, as 
evidenced by at least 13 connotations 
attributed to the word abstract. This 
semantic fluidity has demonstrable 
implications on communicative proc- 
esses and hence upon a unified basis 
for understanding, which is experi- 
enced most acutely when trying to 
isolate a neuroanatomical site for the 
nebulous concept. 

One of the major difficulties in 
trying to construct a unified or uni-
dimensional definition of abstraction
is the inability to anchor the term in a
tangible entity, for the inherent na-
ture of abstraction removes it from a
specific or concrete point of reference.
Goldstein referred to an abstract 
attitude which is antithetical to 
concreteness. He stated that a con- 
crete attitude is concerned merely 
with the stimulus and its attributes, 
but “The abstract attitude embraces 
more than merely the ‘real’ stimulus 
in its scope” (Goldstein & Scheerer, 
1941). This interpretation reflects 
the Latin origin, for the abstract 
meaning is "drawn away" from the
original anchor point and assumes
new dimensions of a symbolic nature. 
Abstraction thus becomes a symbolic 
process which has no specific point of 
reference, but rather constitutes a 
classification of general elements. 
These elements form the molar com- 
posites of multifaceted concept cate- 
gories. It is within this framework of 
generalization that the concept of 
abstraction is usually employed in 
psychological discourse. 

As a process of generalization, ab- 
straction has been conceived as a 
restrictive phylogenetic function with 
a one to one relationship with corti- 
calization. Thus, the most complex 
aspects or major portions of abstrac- 
tion are said to reside in the most 
intricate of sentient creatures, Homo 
sapiens. This does not imply, how- 
ever, that some of the lower species 
are not able to abstract nor does it 
devaluate the role of animal studies. 

Ascending the phylogenetic scale, 
the increasing magnitude of problem 
solving, adaptation, and responding 
is readily noticeable, with the highest 
level attained by man. This observa- 
tion is in keeping with the assumption 
of a relationship between the ability 
to abstract and the phylogenetic 
level, and this congruity has led to 
the belief that corticalization is the 
determining force in the ability to 
abstract. 

The implied relationship between 
the cortex and abstraction has been 
considered insufficient by many in- 
vestigators, however, and a greater 
degree of refinement or localization 
has been sought. This desire to ascer- 
tain meaningful data presented a 
methodological quandary, for ethical
restrictions prevented unlimited ma-
nipulation of human subjects, and
this situation gave direction to two 
diverse paths of research. One path 
led to experimental ablation of tissue 
in infrahuman subjects; the other
method assumed the form of syste-
matic observation of individuals af-
flicted with brain disease or injury.
Both paths gave credence to the 
postulation that abstraction was 
associated with the frontal lobes. 
However, this belief has been ques- 
tioned by Hebb (1949) and others, 
and there is an abundance of con- 
flicting literature awaiting clarifica- 
tion and organization of the frag- 
mentary findings. 

This situation has obvious implica- 
tions for the problem of definition of 
brain damage. Test construction for 
measurement of abstract functioning 
depends upon a definition of the 
mediation process, and at present 
lacks a concrete neuroanatomical 
reference. Having postulated a con- 
cept of generalizing about some prop- 
erties of groups of stimuli, the method 
of transmitting the information comes 
to the fore. Which of the sensory 
modalities is best for assessing this 
function? Semmes, Weinstein, and 
Teuber (1954) found that the order of 
presentation of visual and tactual
tasks was important for certain lesion 
locations. Also, patients with tem- 
poral lesions performed most poorly 
on tactual tasks. Maier and Sabom 
(1937), in a study of the effect of 
lesion shape in rats on "reasoning" 
tasks, found that the shape of the 
lesion was an important character- 
istic in the alteration of behavior. 
Regardless of the direction of the 
long axis in the oblong lesions, round 
lesions were found to create the most 
deterioration. These two studies are
representative of a number which 
underline the multiplicity of factors 
involved in neural functioning, which 
must be accounted for in any efforts 
to assess the effects of organismic 
variation on some a priori criterion. 

It can be seen that many of the 
discrepancies in the literature are
ramifications of the futile effort of 
trying to draw parallels between non- 
parallels. The net effect is that a 
variety of methods of assessment and 
heterogeneous groups of subjects 
have been utilized in trying to obtain 
a unified concept for the neurological 
basis for abstraction. Many investi- 
gators have rendered only lip service 
to the multivariate concept of ab- 
straction as consisting of gradients 
along a unidimensional scale. How- 
ever, McGaughran and Moran (1957) 
have demonstrated that there are 
differentiable aspects of abstraction 
which render it a multivariate con- 
cept with many vectors. Their con- 
clusion is congruent with the previ- 
ously cited studies and the evidence 
appears clear that abstraction must 
be considered as a multidimensional 
function with many modes of expres- 
sion. 

Can a multidimensional function 
such as abstraction be limited to one 
neural area such as the frontal lobes? 
Tilney (1930) believed that all higher 
mental functions are located in the 
frontal lobes. More recently, how- 
ever, this sweeping generalization 
has been challenged. In an attempt 
to achieve an answer to this question 
which would be devoid of extremism, 
Landis, Zubin, and Mettler (1950) 
evaluated reports of frontal lobe 
surgery. There were many varied 
reactions which were usually transi- 
tory in nature, but there was no 
evidence of deterioration of higher 
functions after a relatively brief span 
of time. This and other work, such as 
Lashley's (1933), lends support to the 
concept of equipotentiality, but the 
evidence is still inconclusive. Fur- 
ther evidence against the frontal lobe 
paradigm was offered by Morrow and 
Mark (1955) in a correlational ap- 
proach to neurological findings and 
intelligence. They utilized the au-
topsy results on 22 brain damaged
patients, and found less psychometric 
deficit when the lesions were anterior 
to the Rolandic fissure than when the 
lesions were posteriorly situated. In
addition, Hanvik and Andersen 
(1950) failed to find significant rela- 
tions between focal and diffuse lesions. 

One problem in assessing the effect 
of locus is the tendency to isolate 
this influence from other contributory 
forces within the resulting pathology. 
The obvious aspects of this situation 
are revealed by the investigation of 
Fitzhugh, Fitzhugh, and Reitan
(1961) which was instigated by Yates' 
(1954) criticism of the aberrant con- 
cept of brain damage as a unitary 
factor. Fitzhugh, Fitzhugh, and 
Reitan found a significant dimension 
on a continuum of chronicity of 
acuteness of brain disease. Prior to 
their research Belmont and Birch 
(1960) demonstrated the importance 
of the temporal element of assess- 
ment: the age at which a neurological 
insult was received influenced the 
recorded performance significantly. 
This evidence adds impetus to the 
multiple aspects of assessing abstrac- 
tion, but also adds complexity to the 
problems of measurement. 


DISCUSSION 


It seems apparent that no one ap- 
proach to the problem of brain dam- 
age is sufficient, but rather a global or 
molar attempt at interrelating the 
diverse facets is indicated. Although 
abstraction is a unitary term, it is a 
multivariate concept and must be 
treated as such. More systematic 
research needs to be conducted which 
considers age, locus, and definition 
along with the type of sensory modal- 
ity utilized in the tasks of abstraction. 
In addition to these factors, the 
integrative functioning of the nervous
system cannot be denied, for although
there may be some specificity of brain
areas, functional heterogeneity
coupled with functional association 
remain as prominent influences. 
There does appear to be a relation- 
ship between cortical functioning and 
abstraction, but this relationship does 
not preclude other neurological influ- 
ences, and it appears premature and 
possibly fallacious to postulate the 
unequivocal role of the frontal lobes 
in abstraction at the present state of 
our knowledge. Statements such as 
that of Halstead (1947), that the 
frontal lobes "are the organs of 
civilization," must be regarded as 
speculation rather than scientific fact. 
Two general factors stand out for 
future investigation of brain damage: 
the method of assessment and the 
purpose of assessment. An important 
aspect of method is the use of ap- 
propriate statistical techniques. For 
example, Stein (1961) in a study of 
the effects of three variables on visual 
motor functioning was able to increase
the percentage of correct classifica- 
tions with the use of B coefficients. 
This is a major step toward the use of 
multiple correlation techniques. An- 
other promising approach is the use of 
discriminant functions to generate 
optimal weights for discrimination of 
diagnostic groups. In his review of 
“Psychological Deficit,” prepared for 
a forthcoming Annual Review of 
Psychology, Reitan called attention to 
the work of Harper (1950a, 1950b), 
Pichot and Perse (1952), and the 
symposium edited by Wheeler (1961) 
as illustrative of this approach. Since 
there are many facets to brain dam- 
age, such as age, locus of lesion, and 
extent of damage, multivariate ap- 
proaches are needed to account for 
the diverse performances of brain 
damaged subjects. 
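
The discriminant-function approach mentioned above can be illustrated with a minimal sketch. It assumes only two diagnostic groups and two test scores per subject; the function names, the scores, and the group labels are hypothetical and are not taken from any of the studies cited. The weights are the classical Fisher solution, the inverse of the pooled covariance matrix applied to the difference between group centroids.

```python
from statistics import mean


def fisher_weights(group_a, group_b):
    """Two-variable Fisher discriminant weights for two groups.

    Each group is a list of (score1, score2) pairs (hypothetical data).
    Returns the weight pair and a cutoff midway between the projected centroids.
    """
    def centroid(group):
        return (mean(p[0] for p in group), mean(p[1] for p in group))

    def scatter(group, m):
        # Sums of squares and cross-products about the group centroid.
        s11 = sum((p[0] - m[0]) ** 2 for p in group)
        s12 = sum((p[0] - m[0]) * (p[1] - m[1]) for p in group)
        s22 = sum((p[1] - m[1]) ** 2 for p in group)
        return s11, s12, s22

    ma, mb = centroid(group_a), centroid(group_b)
    sa, sb = scatter(group_a, ma), scatter(group_b, mb)
    dof = len(group_a) + len(group_b) - 2
    s11, s12, s22 = ((sa[i] + sb[i]) / dof for i in range(3))  # pooled covariance

    det = s11 * s22 - s12 * s12
    d1, d2 = ma[0] - mb[0], ma[1] - mb[1]
    # Weights = inverse(pooled covariance) * (centroid difference).
    w1 = (s22 * d1 - s12 * d2) / det
    w2 = (-s12 * d1 + s11 * d2) / det
    cutoff = 0.5 * (w1 * (ma[0] + mb[0]) + w2 * (ma[1] + mb[1]))
    return (w1, w2), cutoff


def classify(weights, cutoff, scores):
    """Assign a subject to group "A" or "B" from the weighted composite."""
    composite = weights[0] * scores[0] + weights[1] * scores[1]
    return "A" if composite > cutoff else "B"


# Hypothetical scores on two tests for two diagnostic groups.
group_a = [(4, 9), (5, 11), (6, 10), (5, 12)]
group_b = [(8, 5), (9, 7), (10, 6), (9, 4)]
weights, cutoff = fisher_weights(group_a, group_b)
```

With more than two measures the same idea requires a general matrix inverse; the two-variable case is used here only to keep the arithmetic visible.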
The fruitfulness of the multiple-
measurement approach for the as-
sessment of the diverse facets of or-
ganic brain damage has been demon-
strated most impressively by the
contributions of the Neuropsychology 
Laboratory of the University of 
Indiana Medical Center, summarized 
by Reitan (1959c). Over the past 8 
years a standard test battery (includ- 
ing Halstead’s Neuropsychological 
Test Battery, the Trail Making test, 
a modification of the Halstead-Wep- 
man Aphasia Screening test, the 
Wechsler-Bellevue Intelligence Scale, 
Form I, and the Minnesota Multi- 
phasic Personality Inventory) has 
been administered to over 2,000 pa- 
tients for whom detailed neurological 
and neurosurgical brain descriptions 
were also systematically obtained to 
provide accurate neurological criteria 
for research. Reitan’s research pro- 
gram, aimed toward the establishment 
of principles for use in evaluating 
human brain functions, has not only 
produced a large number of highly 
significant diagnostic results, but also 
emphasizes the importance of neuro- 
anatomy, neurology, and neuropa- 
thology in interpreting the differential 
nature of brain lesions from psycho- 
logical measurements. 

Not only is the required approach
multivariate but the nature of the
diagnostic problem involved has been
characterized by Becker¹ as requiring
a sequential decision making model.
The major initial decision, according
to Becker, would be whether or not
brain damage is present. Following
this, further decisions would be re-
quired to determine whether it is
diffuse, multiple focal, or focal in na-
ture. If diffuse, localization of maxi-
mal involvement would be next; i.e.,
by hemisphere, lobe, and so on. At
this point it would be possible to
select the most appropriate etiologic
hypotheses and to test them. In addi-
tion, it may be important to deter-
mine whether the lesion is progressive
or static in relation to the observed
etiology. This model, as Becker
points out, “is multi-dimensional in
every sense of the word, but it is
hierarchically ordered such that there
are sequential dependencies among
the various dimensions.”

¹ The authors are indebted to Wesley C.
Becker for his criticism of the manuscript and
his suggestions, particularly on this point.
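
The sequential dependencies in Becker's model can be made concrete with a small sketch. The class name, field names, and wording of the steps are hypothetical and serve only to show the ordering: each later question is taken up only after the earlier one has been answered.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Findings:
    """Hypothetical summary of the evidence available for one patient."""
    damage_present: bool
    distribution: Optional[str] = None  # "diffuse", "multiple focal", or "focal"
    site: Optional[str] = None          # e.g., a hemisphere or lobe
    progressive: Optional[bool] = None  # True = progressive, False = static


def decision_sequence(f: Findings) -> List[str]:
    """Walk the decisions in order, stopping at the first unanswered one."""
    if not f.damage_present:
        return ["no brain damage indicated"]
    steps = ["brain damage present"]
    if f.distribution is None:
        return steps
    steps.append(f"distribution: {f.distribution}")
    if f.site is None:
        return steps
    steps.append(f"maximal involvement: {f.site}")
    if f.progressive is None:
        return steps
    steps.append("course: progressive" if f.progressive else "course: static")
    return steps
```

For example, `decision_sequence(Findings(True, "diffuse"))` stops after the distribution step, because localization has not yet been established.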

The question of purpose is, of 
course, fundamental and should per- 
haps be reconsidered in a frame of 
reference of cooperative interaction 
between clinicians and scientists. 
New knowledge indicates the aban- 
donment of simple categories and 
acceptance of more complex, quanti- 
tative continua. 

There are many factors other than 
brain damage, representative of both 
the integrated personality and effects 
of trauma, illness, or stress, which can 
influence behavior. Yates (1954) and 
Meyer (1961) suggested that the goal 
should be to assess the amount of 
brain damage along with other be- 
havioral dimensions such as psychoti- 
cism and neuroticism. This approach 
need not, however, treat brain dam- 
age as a unitary factor. 


CONCLUSION 


The foregoing review of the prob- 
lem and of the relevant literature 
emphasizes the need to treat the be- 
havioral effects of brain damage as a 
sequentially ordered, multidimen- 
sional problem. A unitary score for 
brain damage would conceal its 
multivariate nature. In addition to 
assessing the amount of brain dam- 
age, the required approach would also 
make it possible to discover the loca-
tion and kind of brain damage in-
volved in particular cases.


A COMMENT ON “A PARADIGM FOR DETERMINING THE
CLINICAL RELEVANCE OF HYPNOTICALLY
INDUCED PSYCHOPATHOLOGY”


EUGENE E. LEVITT 
Indiana University Medical Center 


A critique of the paper by Reyher in which he sets forth a model for 
experiments in the hypnotic induction of emotional states, and attacks 
all earlier research which fails to conform to this model. It is pointed 
out that Reyher's approach assumes that various psychoanalytic hy- 
potheses of the origins of emotional states are empirically verified. He 
supposes erroneously that hypnotically suggesting a state directly 
always specifies the behavior which should follow from that state. He 
assumes, without empirical evidence, that suggesting a state in a “pure” 
form renders consequent experimental results inapplicable to clinical 
practice. Reyher's rejection of the earlier work, as well as much of his 
claim for his model, is found to be without adequate basis. 


In a recent paper, Reyher (1962) 
casually tosses “all previous research 
in the hypnotic induction of psycho- 
pathology" into the waste basket be- 
cause it fails to conform to his ideal 
experimental paradigm. He directs 
his major fire against studies like that 
of Levitt, den Breeijen, and Persky 
(1960), in which the existence of an 
emotional state is directly suggested 
to the hypnotized subject. The de- 
sign which Reyher recommends is 
one in which hypnosis is used to 
create an artificial conflict in the sub- 
ject, which is then repressed (hyp- 
notically induced amnesia), and fi- 
nally reactivated in the waking state 
by an associated stimulus. 

Reyher's sweeping generalization 
is not supported by either logical or 
empirical analysis. Neither is the 
case for the research superiority of his 
paradigm. Reyher has simply made 
some crucial methodological blunders, 
as I shall attempt to demonstrate.


The Confusion of Theoretical with 
Empirical 

According to Reyher's analysis, his 
paradigm is valid because the stimu-
lus which is used to induce the state is
structurally most similar to the 
etiology of the state as it occurs nat- 
urally. 

What Reyher has done is to con- 
fuse theoretical with empirical, and 
this confusion is reflected throughout 
his analysis. He accepts the psycho-
analytic hypothesis (repressed con-
flict → anxiety → psychopathological
symptom) as an empirically demon-
strated fact. He assumes, therefore,
that any behavior which follows a 
repressed conflict is, in fact, a mani- 
festation of anxiety, even though it is 
identifiable as depressed behavior, 
hostile behavior, etc. These are 
simply ways in which anxiety is 
“managed defensively." The stimu- 
lus is so paramount in Reyher's 
definition that the response is prac- 
tically immaterial, as long as it can be 
included in “some classification of
psychopathology.”

Reyher's reasoning may be satis- 
factory to the psychoanalytic theo- 
rist, but one does not need to be par- 
ticularly hard-nosed to reject it. 
When most experimentalists and 
clinical practitioners use the term
“anxiety” they are referring pri-
marily to behavior: verbal, motoric,
and physiological. This behavior is 
manifest and measurable with con- 
siderable reliability. The psycho- 
dynamic origin of the behavior is 
vague, slippery, subject to retro- 
spective distortion, and otherwise 
extremely difficult to determine, let 
alone to measure. It is hardly suit- 
able for an experimental construct 
definition.

Reyher's confusion is similarly 
manifest in the suggested use of his 
paradigm for inducing depression. 
He assumes, a priori, that the hy- 
pothesis which states that the re- 
moval of “objects and symbols for 
the gratification of important emo- 
tional needs" will lead to depression, 
has been empirically verified. Again, 
the nature of the subject's actual re- 
active behavior is immaterial. The 
definition of the construct has already 
been specified by the stimulus which 
the hypothesis requires. 

Now, Reyher is on safe ground 
when he says that his paradigm can 
be used to test a hypothesis concern- 
ing the etiology of a psychopatho- 
logical condition. But this is not the 
same as studying the condition itself. 
A test of the hypothesis that Conflict 
X leads to Behavior Y does not re- 
quire that most subjects respond with 
Behavior Y. If they do not, we con- 
clude simply that the hypothesis is 
invalid. 

If we wish to study Behavior Y, 
however, we must evoke Behavior Y. 
A stimulus which leads to many other 
forms of behavior in addition to Y is 
evidently not satisfactory, unless you 
make some sort of unsubstantiated 
assumption, as Reyher does. 

Reyher's scornful dismissal of the 
direct suggestion technique employed 
by Levitt, den Breeijen, and Persky
(1960) and others follows from his
confusion of theoretical and empirical. 
His evaluation is based not on in- 
duced behavior, but solely on the 
stimulus, which does not conform to 
his hypothesis of how anxiety is gen-
erated. He must, therefore, overlook
the predictable anxious behaviors 
which have so far been obtained by 
the Levitt, den Breeijen, and Persky 
technique on the following measures: 
clinical judgment of manifest be- 
havior, Rorschach, TAT, Manifest 
Anxiety scale, IPAT Anxiety scale, 
Affect Adjective Check List, Barron 
Ego-Strength scale, pulse rate, and 
plasma hydrocortisone level (Grosz & 
Levitt, 1959; Levitt, den Breeijen, & 
Persky, 1960; Levitt & Grosz, 1960; 
Levitt & Persky, 1960; Persky, Grosz, 
Norton, & McMurtry, 1959; Zucker- 
man, 1960). 


Confusing the Suggesting of a State
with Specifying Behavior

Reyher criticizes the technique 
used by Levitt, den Breeijen, and 
Persky because the “S merely carries 
out the suggestions that are given to 
him; the instructions specify the 
behavior." The question of whether 
or not the suggestions specify the 
behavior must be considered in light 
of the specific suggestions, and the 
specific behaviors which are subse- 
quently measured. Thus, if the sub- 
ject is instructed to feel a certain way, 
the response measurements should 
include behaviors other than a report 
of the subjective feelings which were 
directly suggested. This is a well- 
worn axiom of hypnosis research. 

In the Levitt, den Breeijen, and 
Persky study, the subjects were in- 
structed to experience anxiety, fear, 
dread, apprehension, and panic; these 
were the exact words. They
were not told how to react to the sug-
gested feeling. They were not in-
structed to tremble or sweat, or to see
a “nightmare” in TAT Card 19 where 
previously they had seen a “child-
hood dream,” or to have a more
rapid pulse rate. The situation did 
not “demand” that they respond 
with more shading percepts to the 
Rorschach, or with an elevated level 
of adrenocortical hormone in the 
plasma. It is patently nonsensical to 
contend that the hypnotic instruc- 
tions specified these behaviors. 


The Confusion of “Controlled” with
“Useless”


Reyher criticizes the direct tech- 
nique because it attempts to study an 
emotion like anxiety in its “pure” 
form, i.e., relatively uncontaminated 
by other states. Since anxiety does 
not occur in a pure form in the human 
organism in its natural milieu, this
“entirely destroys” the “clinical sig-
nificance” of findings with this ap-
proach (though it is “scientifically
legitimate,” an interesting distinc-
tion).

It is probably true that anxiety 
occurs more frequently in a mixture 
with other emotional states, but it is 
not uncommon to find it in a rela-
tively pure form, especially at a par-
ticular moment in time. Even when
the diagnosis is "anxiety with depres- 
sion," the latter is often found to 
obtrude periodically upon a more 
constant anxiety state. However,
even if anxiety never occurred alone, 
the notion that it is valueless to iso-
late it for study is naively unscientific.

What we have done is to attempt to 
experimentally control the factor of
“other emotional states” so that we
can study anxiety alone. In principle,
the approach is no different from that 
of the experimenter who deliberately 
selects a sample which is homogeneous 
with respect to Variable X so that he
can investigate Variable Y, which is
normally correlated with X. The
approach does not ignore the correla- 
tion; it simply seeks to neutralize it 
for experimental purposes. There are 
many experimental situations in 
which such control is not only desir- 
able, but downright necessary. One 
does not need to abandon the con- 
cept of the total organism to do it. 
To maintain that this sort of control 
renders results meaningless is to flout 
the history of psychological science, 
as well as to deny scientific method 
itself. 

What Reyher presumably means 
by the absence of clinical significance 
is that studies of hypnotically induced 
psychopathology with methods other 
than his own will have no practical 
usefulness in the diagnosis and treat- 
ment of psychiatric patients. Reyher 
may be correct, but his claim must be 
settled pragmatically, not by arm- 
chair debate. Technique is discarded 
when it is demonstrated to be fruit-
less, not because of a priori theo-
retical considerations. From this
viewpoint, it is also possible to state 
that Reyher's ideal paradigm has not 
yet been shown to have clinical value. 

Reyher’s paradigm is promisingly 
ingenious though it has its limita- 
tions, not all of which have been 
mentioned in this brief note. His 
analysis of other research in this area 
is without adequate foundation.


A REPLY TO LEVITT'S COMMENTS

Levitt's critique indicates that he missed the main point of the paradigm,
and that he misinterpreted a variety of other matters. An effort is 
made to clarify the issues involved by citing sections of the original 
article and by further discussion when necessary. Particular emphasis 
is placed upon the paradigm as a method for testing theories of psycho- 
pathology which assume genotypic-phenotypic relationships. The need 
for proper control groups is reiterated, and the paradigm is viewed as 
an attempt to bring the data of clinical and experimental methods
closer into congruence.


Levitt (1963) has severely criti- 
cized Reyher's (1962) paradigm for 
determining the clinical relevance of 
hypnotically induced psychopathol- 
ogy. In Levitt's zeal to defend direct 
suggestion as a scientifically useful 
method, he apparently did not grasp 
the main import of the paradigm and 
interpreted incorrectly a variety of 
other matters. He introduces his 
comments with a misinterpretation 
of Reyher's views and procedures: 
The design which Reyher recommends is one 
in which hypnosis is used to create an artificial 
conflict in the subject, which is then repressed 
(hypnotically induced amnesia), and finally
reactivated in the waking state by an asso-
ciated stimulus (p. 326).


Reyher does not make a specific 
recommendation, but he does sug- 
gest that the activation of natural 
conflicts, not artificial conflicts, 
"would seem to have the greatest 
potential for creating psychopathol-
ogy" (p. 349). Reyher does not
equate repression with a hypnotically 
induced amnesia, as Levitt alleges, 
but he does equate it with the failure 
of a subject to comply with a post- 
hypnotic suggestion to become aware 
of and act upon the hypnotically 
induced affect and impulses. 

Many of Levitt's criticisms are 
related to what he refers to as “The
Confusion of the Theoretical with the
Empirical." Reyher does not state, 
as Levitt alleges, that the "paradigm 
is valid because the stimulus which is 
used to induce the state is structur- 
ally most similar to the etiology of 
the state as it occurs naturally" (p. 
326). The paradigm has four parts, 
none of which mention this similarity. 

The paradigm was set up specifi- 
cally to obviate confusion between 
the theoretical and the empirical. 
The genotype refers to a stimulus 
derived from theory, clinical lore, 
hunch, etc., and the phenotype refers 
to overt behavior. The paradigm is 
designed to separate mechanisms of 
suggestion from the behavioral out- 
come and includes a control group to 
assess the demand characteristics of 
the research. Of course, the paradigm 
is inappropriate in reference to any 
point of view that does not recognize 
the central significance of genotypic- 
phenotypic relationships in psycho- 
pathology. 

Levitt states that Reyher accepts 
the psychoanalytic hypothesis as an 
empirically demonstrated fact.
Again, he is incorrect. Reyher re-
peatedly points out that the paradigm is
a procedure for testing theories,
psychoanalytic or otherwise. The
paradigm does assume that geno-
typic-phenotypic relationships are of
central significance in conceptualizing 
psychopathology, and it provides a 
method for investigating them ob-
jectively.

Levitt states,

Reyher's confusion is similarly manifest in the 
suggested use of his paradigm for inducing de- 
pression. He assumes, a priori, that the hy- 
pothesis which states that the removal of 
"objects and symbols for the gratification of 
important emotional needs" will lead to de- 
pression, has been empirically verified (p. 327). 


Again, Levitt is incorrect. Reyher 
mentions that the foregoing etiology 
of depression is one of at least two 
"clinical models from which to 
choose." Models are not empirically 
verified relationships. 
Levitt states, 

A stimulus which leads to many other forms 
of behavior in addition to Y is evidently not 
satisfactory, unless you make some sort of un-
substantiated assumption, as Reyher does
(p. 327).


On the contrary, such a stimulus 
could be very satisfactory if Y and 
other reactions are related to the 
stimulus through a second variable, 
Z. Reyher (1961) has reported that 
the type of reaction is related to the 
degree of spontaneous posthypnotic 
repression (Variable Z). Further- 
more, a stimulus which leads to a 
variety of reactions or symptoms is 
valuable because of the opportunity 
which is presented for correlating 
specific symptoms with measurable 
aspects of personality. The subjects 
who share a given symptom may also 
have in common a distinct constella- 
tion of needs, defenses, and traits. 

Levitt further criticizes Reyher for 
“Confusing the Suggesting of a State 
with Specifying Behavior” and “The 
Confusion of ‘Controlled’ with ‘Use-
less.’” In these sections, he defends
the use of direct suggestion. Unfor-
tunately, even well controlled studies
involving direct suggestion, which
focus exclusively on the phenotype, 
have limited theoretical and empirical 
significance. The studies that he cites 
to support direct suggestion lack the 
proper controls for separating the 
effects of hypnotic suggestion and the 
demand characteristics of the re- 
search design from the observed be- 
havior. Barber (1961), Orne (1959), 
and Sutcliffe (1961) have firmly 
established the need for the proper 
control groups, despite Levitt's com- 
ment that it is “patently nonsensi- 
cal" to do so. In the absence of these 
controls, the investigator is left 
with a contaminated phenotype that 
has little, if any, significance in re- 
gard to etiology. This is perhaps the 
most important reason for the almost 
zero impact that instrumental hyp- 
notic research has had upon the bio- 
logical and behavioral sciences. 
There is no quarrel with Levitt's 
high regard for scientific method or 
his desire to control experimentally 
emotional states. We can only insist 
that he incorporate a control group 
composed of subjects who, unknown 
to the experimenter, are requested to 
fake hypnosis. Nevertheless, the 
result would be a phenotype which is 
an amalgam of the suggested behavior 
and whatever other processes are 
activated by the suggestions. 
Clinical observation and controlled 
experiment are both legitimate and 
distinctive scientific methods, but it 
is not always possible to generalize 
directly from the results of the latter 
to clinical or natural phenomena. 
The paradigm in question can be 
viewed as an attempt to bring the 
data of clinical and experimental 
methods closer into congruence.


IMMEDIATE MEMORY IN SEQUENTIAL TASKS


MICHAEL I. POSNER
University of Michigan 


Factors affecting the memory capacity are basic to understanding sequen-
tial tasks. The evidence indicates immediate memory is sometimes subject
to decay, but that interference from interpolated items has a much larger 
effect. Interference effects are particularly great when the S must hold 
items in store while responding to previously stored material within an 
ongoing serial task. The ability of S to use time to reorganize the stimuli
for storage works against the decay tendency. Only in rare instances does 
S store a pure representation of the stimulus; rather he must be viewed 
as an active information handler applying his knowledge of the nature
of the stimulus and response to reduce his memory load.


A large number of psychological tasks 
involve the presentation of sequential 
information to the subject (S). In these 
tasks S is required to receive, process, 
and store data for representation in the 
response. If the task requires nearly 
complete representation of the stimulus 
in the response, then as the number of 
stimuli to be so included increases or
as the delay between stimulus and re- 
sponse gets longer, the memory require- 
ment becomes a limiting factor in the 
correct completion of the task. 

The study of immediate memory be- 
gan with the classic memory span ex- 
periment (Jacobs, 1887) in which a 
series of simple stimuli are presented to 
S a single time, and then reproduced 
immediately by him. In recent years 
immediate memory has been investigated 
as an integral part of the whole question 
of retention in living organisms. Thus 
the traditional operations of the memory
span experiment, recall after a single
presentation (Blankenship, 1938), are
no longer adequate to include the newer 
studies in this area. In the studies which 
will be considered in this paper the cru- 
cial common factor appears to be that 
S holds the material in store for use in 
a single response within the context of 
the serial task in which he is engaged. 
The S does not attempt to make the ma-
terial available for repeated responses;
indeed, it may be of crucial importance 
to discard the information immediately 
after it is used. 

The first two sections of the paper deal 
with studies concerned with the effects 
of decay and interference factors on the 
immediate memory capacity. In the 
third section factors which affect the
organization of material as it is placed 
in store are considered. 


TIME IN STORE 
Two major methods are used to
manipulate the time an item is in store.
One is to vary the time between succes-
sive stimuli in a serial task. The other
is to present the stimulus and then vary
the delay before allowing a response.
The results of these methods are con- 
sidered in this section. 


Decreasing Rate of Presentation 
Increased Recall 


One of the oldest fundamental prob- 
lems of immediate memory is the change 
in recall with the interval between suc- 
cessive stimuli (interstimulus interval). 
Bergstrom (1907) found a uniform de- 
cline in errors of immediate reproduc- 
tion with increasing time intervals from 
.5 to 2 seconds. Bergstrom concluded
that, at least for words, the percentage 
of correct responses varies as the log 
of the interstimulus interval. Two more 
recent studies using different techniques 
(Guthrie, 1933; McReynolds & Acker, 
1959) have confirmed Bergstrom's gen- 
eral result. 
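
Bergstrom's conclusion can be put in symbols as a rough sketch. Here P stands for the percentage of correct responses and t for the interstimulus interval; a and b are fitted constants that are not reported in the sources cited here, and all of the symbols are introduced only for illustration.

```latex
P(t) = a + b \log t, \qquad P(2t) - P(t) = b \log 2 \quad (b > 0)
```

On this reading, each doubling of the interval adds the same increment to the percentage correct, a statement that should be taken to hold only over the range Bergstrom studied (.5 to 2 seconds).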

Pollack (1952) presented Ss with mix- 
tures of digits and consonants at inter- 
stimulus intervals ranging from .31 to 4 
seconds. He reports a decrease in the 
percentage of series correctly recalled 
with reduction in interstimulus interval. 
There was, however, some indication of a 
reversal in this general rule in the inter- 
val from .62 to .31 second. In this re- 
gion, with stimuli from two to five bits 
per item, the performance showed a 
slight rise with increased speed. Pollack 
attributes his basic result to time for re- 
hearsal and reorganization between 
items. This factor may be eliminated 
at the fastest rates, particularly with 
high information in the stimuli. In an- 
other study (Pollack, Johnson, & Knaff, 
1959), using a different technique, the 
same general relationship was found. 
This study presented strings of digits 
from 25 to 40 units in length. The Ss 
in different groups were either informed 
or not informed of the series length, but 
in both cases were required to present
from memory the largest number of suc-
cessive adjacent digits ending with the
terminal point. The study was con-
ducted at intervals of .25, .5, 1, and 2
seconds. The results showed that sig- 
nificantly more items are retained at the 
slower speeds than at the faster speeds. 
Also, as the rate of presentation of the 
series increased, differences between 
groups which knew when the series would 
stop and groups without such informa- 
tion were reduced. 


Decreased Recall 


A decay theory would suggest that 
under some conditions, where perceptual 
organization and rehearsal were not im- 
portant, increasing the rate of presenta- 
tion would decrease the time in store 
and improve recall. Some studies of the 
memory span type, with very rigid pac- 
ing of presentation and recall to control 
rehearsal and with simple stimuli to re- 
duce organizational factors, have shown 
this effect. 

The earliest attempt to control com- 
pletely organization and rehearsal in an 
immediate memory study  (Fraisse, 
1937) used simple click sounds as the 
stimuli. The Ss received from 3 to 10 
of these sounds and then were required 
to tap them back in a rhythm reproduc- 
ing their presentation and to obtain the 
correct number of clicks. Obtaining the 
correct number would, of course, be 
trivial if Ss could count the number 
of clicks which were presented to them. 
To prevent this, Ss were required to 
perform a simple task such as counting 
backwards which interfered with the 
counting of the clicks, but presumably 
not with hearing them. The extent to 
which this device and the instructions 
were successful can only be demon- 
strated by the results of the study. As 
the interstimulus interval increased, the 
number of sounds for which 50% correct 
reproduction was possible fell. At an in- 
terval of .17 second the average was 5.7; 
and this decreased regularly until at an 
interval of 1.8 seconds only 3.3 sounds
were correctly reproduced 50% of the
time. It can be argued, however, that
this study confounds increasing time in 
store with amount of interference from 
the second task. 

Recently, studies (Conrad, 1958; 
Conrad & Hille, 1958; Fraser, 1958) 
using the more conventional stimuli of 
digits and not involving a second dis- 
tracting task have shed some light on 
this problem. Conrad and Hille (1958) 
presented series of eight digits to Ss at 
intervals between stimuli of 2 seconds 
and .67 second. There were three condi- 
tions of recall: paced at 2-second in- 
terval, paced at .67-second interval, and 
unpaced. In paced recall, Ss were re- 
quired to write their responses in order 
and in time to the beating of a mechani- 
cal device. In the unpaced conditions, no 
timing was done, so that in these condi- 
tions the length of time a given item was 
in store could not be determined. The 
results show that with paced recall the 
percentage of correct responses declines 
significantly with increasing mean time 
in store. The unpaced recall situation, 
for which no mean time could be calcu- 
lated, is superior to all types of paced 
recall. This seems to indicate that when
rehearsal is controlled by pacing both 
stimuli and responses, recall is a func- 
tion of mean delay between stimulus 
and response. Conrad (1958) and Fraser 
(1958) have confirmed this essential re- 
sult. 
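
The notion of mean time in store under paced presentation and paced recall can be illustrated with a short calculation. This is only a sketch under stated assumptions: items are recalled in order of presentation, and the first response is made one recall interval after the last stimulus. The function name and the timing convention are hypothetical and are not taken from Conrad and Hille's report.

```python
def mean_time_in_store(n_items, present_interval, recall_interval):
    """Mean seconds between presentation and recall of an item.

    Assumes ordered recall, with the first response one recall interval
    after the final stimulus (an illustrative convention only).
    """
    times = []
    for i in range(1, n_items + 1):
        presented_at = (i - 1) * present_interval
        recalled_at = (n_items - 1) * present_interval + i * recall_interval
        times.append(recalled_at - presented_at)
    return sum(times) / n_items


# The four paced combinations for series of eight digits.
for present, recall in [(0.67, 0.67), (2.0, 0.67), (0.67, 2.0), (2.0, 2.0)]:
    print(present, recall, round(mean_time_in_store(8, present, recall), 2))
```

Under these assumptions the fast-fast condition yields the shortest mean time in store and the slow-slow condition the longest, which is the ordering along which the percentage of correct responses is reported to decline.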

In general, decreasing the rate of 
presentation allows S more time to or- 
ganize, perceive, and rehearse the ma- 
terial with which he is presented and 
thus in many, if not most, situations it 
results in increasing recall. In some cases, 
however, when the ability to organize 
and rehearse is limited because of strict 
control over the time which S has, in- 
creasing the rate of presentation de- 
creases the time in store and seems to 
cause an increase in the ability to recall 
past stimuli. 


Memory Span 

McLane and Hoag (1943) presented 
lists of six nonsense syllables to a group 
of Ss in the classroom situation. After 
the presentation, varying amounts of 
time from zero to 180 seconds were 
allowed to elapse, during which Ss were 
instructed not to rehearse the material. 
On the experimenter's signal Ss were 
required to record their responses in the 
order in which the stimuli had been pre- 
sented. The results did not show any 
systematic decline in recall with time. 
Instead, a cyclic pattern was apparent in 
the group curves. The authors state that 
the group curves were also faithful rep- 
resentations of the data from five Ss 
tested individually with syllables, words, 
and forms. Anderson (1960), however, 
found that with 12 digits, recall perform- 
ance declines from 72 to 60% over a 30- 
second gap even when Ss are à 


Recall of Visual Patterns 

It is important to note that with 
figures, unlike the unidimensional stimuli 
discussed above, recall of a single stimu- 
lus-response combination can be mean- 
ingfully investigated. 
Much of the long history of memory 
for complex forms showing systematic 
changes in recall in accord with the laws 
of gestalt organization has been severely 
criticized (Hanawalt, 1937) since the 
method of recall involved S reproduc- 
ing the figure and often made use of 
successive reproductions over time. Since 
1937 some investigators have continued 
to find systematic changes of the gestalt 
type over short periods of time (Brown, 
1956; Walker & Veroff, 1956). However, 
a long series of studies using methods of 
recognition, with progressive improve- 
ment in technique (Carlson & Duncan, 
1955; George, 1952; Hebb & Foord, 
1945; Lovibond, 1958), have shown 
either no consistent changes of the trace 
with time or changes which do not con- 
form to any principles of good organiza- 
tion. This same result was found by 
Crumbaugh (1954) using a method of 
paired comparisons. In general, it would 
appear that immediate memory for fig- 
ures shows some decline in accuracy over 
time, but the decline is not necessarily in 
the direction of a normative figure such 
as is suggested by the gestalt concepts. 


Psychophysics 


In the psychophysical method of 
paired comparisons it is possible to vary 
the interval between standard and com- 
parison and thus to study time factors in 
memory. While in this case memory is 
measured by judgment rather than 
recall, one study (Parducci, 1954) has 
shown that several typical memory fac- 
tors apply. It should be pointed out that 
S never considers only the standard 
and comparison in his judgment (Hel- 
son, 1948; Woodworth, 1938), so that it 
is not usually possible to extract com- 
pletely time effects from the particular 
order of the stimuli used (Woodworth 
& Schlosberg, 1954). It is, however, 
instructive to review work in this area, 

Much work in psychophysics (Fech- 
ner, 1860; Kohler, 1923; Postman, 
1946) shows a negative time error, that 
is, the second of two stimuli, when com- 
pared with the first, is overestimated. 
Stevens (1957) suggested that this time 
error is usually an artifact of the range 
effect; this criticism, however, would 
not seem to apply to studies in which 
the interstimulus interval is actually 
manipulated as an independent variable 
(Harris, 1952; Koester, 1945). 

Koester (1945) helped to delineate 
some of the determiners of the time 
error. He found that time errors in loud- 
ness judgments are greatly affected by 
the stimulus level, amount of practice, 
and particular experimental arrange- 
ment. He also distinguished between 
constant errors (consistent direction) 
which were found only in intensive 
judgments, and variable errors which he 
found in both pitch and loudness judg- 
ments. He found that as the time be- 
tween paired stimuli increases from zero 
to 10 seconds there is an increase in the 
variance of the judgments. In the case 
of a series of stimuli presented without 
standard this same effect was found up to 
47 seconds. These results might be in- 
terpreted as a form of forgetting which 
sometimes occurs in a consistent direc- 
tion and in those cases leads to a con- 
stant time error. This lowering of sensi- 
tivity with time was in agreement with a 
study at the turn of the century, with 
judgments of clangs (Angell & Harwood, 
1899), but Postman's data (1946) 
showed no such change in variance with 
time. Data by Irwin (1937) showed an 
increase in pitch sensitivity with longer 
time interval. 

Much of this conflict is reconciled in 
another analysis of change in sensitivity 
of pitch with time. Harris (1952) found 
a consistent decrement in sensitivity with 
time. The study explored the use of a 
fixed standard, that is, a standard which 
was repeated many times and a roving or 
changing standard which was not re- 
peated in different paired comparisons. 
With the fixed standard there was no 
decrement up to 3.5 seconds delay; with 
the roving standard, however, there was 
a steep decline after only 1 second. This 
demonstrates a clear difference between 
short-term memory for a standard that 
has been repeated only a single time 
and a standard which has been repeated 
a large number of times. In another 
study (Bachem, 1954) also involving 
pitch judgments, the time interval be- 
tween standard and comparison was 
varied from 1 second to 1 week. Con- 
sistent differences in sensitivity were 
found all the way out to the longest in- 
terval used. 

The results of time delay in paired 
judgments do seem to indicate that over 
small periods of time there is a decrease 
in the ability of S to match the standard 
and comparison stimuli, particularly if 
the standard is not repeated. Such a 
change in sensitivity would seem analo- 
gous to a decay factor in memory span 
studies. 


INTERPOLATED ACTIVITY 


Investigations interested in the effects 
of a gap in time or of interpolated ma- 
terial between stimulus and response in 
immediate memory have all attempted 
to prevent rehearsal of the relevant ma- 
terial during the gap. We have discussed 
the simplest attempts to prevent such re- 
hearsal through asking S not to think 
of the stimulus material during the gap. 
At a more complex level of control, the 
experimenter interpolates material to fill 
the gap and studies the effect of the 
material and of the gap itself on recall. 
Another design is to fill the gap with 
material relevant to the ongoing serial 
task itself. This method is of particular 
interest because it is close to the serial 
activity of everyday life and has, there- 
fore, implications for skilled perform- 
ance. These two methods will be con- 
sidered in this section. 


Interpolation of Irrelevant Material 


A study by Pillsbury and Sylvester 
(1940) was aimed at discovering the 
effect of similarity of material in retro- 
active and proactive inhibition in im- 
mediate memory. The Ss were exposed 
either to series of six pictures, nonsense 
syllables, or words. The Ss engaged in 
activity intervening between presenta- 
tion and recall showed a decrement in re- 
call scores from 24 to 43% over the con- 
trol groups. Other experiments in this 
series showed that the amount of the 
decrement was more closely related to 
the overall difficulty of the interpolated 
task than to its similarity to the recalled 
material. 

More recently, the interpolation of 
irrelevant material has been studied by 
Brown (1958). Brown's results are in 
agreement with those of Pillsbury and 
Sylvester in showing that a delay of 
several seconds is sufficient for forgetting 
when rehearsal is prevented, even when 
the amount of material in store is within 
the memory span. Brown did not find 
similarity to be an important factor and 
accordingly interpreted his findings as 
supporting a decay theory of immediate 
memory. His results show that proactive 
presentation causes a small decline in 
performance, while retroactive presenta- 
tion causes a great decline and that there 
is little difference between similar and 
dissimilar material. In the final study of 
the series, the time interval between 
stimuli to be recalled and the irrelevant 
additional stimuli was varied. In agree- 
ment with the idea of perseverative con- 
solidation (Müller & Pilzecker, 1900), 
the retention scores increased as the free 
gap grew from .78 to 4.68 seconds. Re- 
call increased at a negatively accelerated 
rate with increasing time for consolida- 
tion prior to the introduction of the ir- 
relevant material. 

Another recent study (Peterson & 
Peterson, 1959) confirms and adds to 
the general notion that immediate mem- 
ory weakens rapidly when an intervening 
activity, even somewhat unrelated, is 
used. In this study, a single three-letter 
trigram was presented as the stimulus 
to be recalled, while the interpolated ac- 
tivity consisted of counting backwards 
by three or four from a fixed three-digit 
number. Intervals from 3 to 18 seconds 
delay were used. The percentage of suc- 
cess varied from .8 with 3 seconds of 
interpolated activity to .1 with the full 
18-second period. In an attempt to un- 
derstand the relation between immedi- 
ate and long-term memory, Ss were al- 
lowed to rehearse the required material 
either orally or silently before the inter- 
polated number was presented. Increas- 
ing the time for oral rehearsal from zero 
to 3 seconds led to increased recall 
scores, but with silent rehearsal no sig- 
nificant improvement occurred. Conrad 
(1960b), using lists of seven and eight 
digits, found that an interruption of less than 
1 second, just sufficient to say the digit 
zero, cuts the percentage of correct re- 
calls of the list in half. This interrup- 
tion can occur at any time during a 10- 
second period of silent rehearsal with 
about equal effect. Thus, for amounts of 
material near the span, silent rehearsal 
appears to be of no use in helping S 
retain the series through even a brief 
interruption. 

The seemingly conflicting data of 
Brown, Peterson and Peterson, and 
Conrad make more sense when an 
analysis is made of the amount of mate- 
rial to be recalled. Brown (1958) used 
a pair of consonants and got consider- 
able consolidation. Peterson and Peter- 
son (1959) used three unrelated let- 
ters and found overt repetition now 
necessary to get a significant consolida- 
tion effect. Murdock (1961) points out 
that the number of chunks rather than 
the number of items is of importance for 
this effect. Murdock repeated the Peter- 
son experiment using a single meaning- 
ful word and found that the effect of 
interruption was much less, but when 
he used three words the results were 
virtual duplicates of the Peterson results. 
Thus what appears to be important is 
the number of organized units which 
must be recalled. In the Conrad (19 


just within the memory span, and thus 
no consolidation effect appears and even a 
momentary interruption causes a 
decrement. Sanders (1961) also pre- 
sented lists of eight digits in a study 
which sought to explore systematically 
the effect of rehearsal on the ability to 
recall after interpolated material. Sand- 
ers, however, had his Ss group the digits 
into four chunks and found a very signif- 
icant effect of rehearsal. Further studies 
are needed to determine the precise 
relationship between amount of material 
and degree of consolidation. 

At least two recent findings pose a 
problem for this view of rehearsal serv- 
ing to prevent decay of the trace with 
time. Murdock (1961) presented a run- 
ning memory task in which S had 
to report the first and last three words 
of a series of unknown length. He varied 
the number of items in the list and the 
rate of presentation. He found the 
number of interpolated items to have a 
very significant effect on the recall of the 
first word, but rate had no effect. Thus 
for a given number of items there was 
no significant effect of varying the overall 
time in store by a factor of four through 
changing the time per item. Peterson 
(1961) finds another phenomenon diffi- 
cult for a decay notion. He presented 
a nonsense syllable twice, separated by 
an interval varying from 1 to 11 seconds. 
This interval was filled by counting 
backwards and the two presentations 
were followed by counting backwards 
for 6 more seconds before recall of the 
syllable. The results show a small but 
significant improvement in recall with 
increasing time between the two pres- 
entations. This is the opposite of what 
might be expected from a simple decay 
notion. 


Interpolation of Relevant Material 
Sequential Presentation 

In a practical sense perhaps the most 
important use of immediate memory is 
when the serial activity itself intervenes 
between stimulus and response. The 
simplest example of this technique is 
what Woodworth (1938) calls the 
method of retained members. A simple 
memory span is converted to retained 
members by enlarging the number of 
stimuli presented in excess of the span 
and requiring S to attempt to repro- 
duce the material presented from the 
beginning. In a study of this method 
(Gates, 1916), the memory span of Ss 
fell from eight to six digits when more 
than eight digits were presented. Another 
variation of this method has been studied 
by Anderson (1960) and called post- 
stimulus cuing. She required Ss to listen 
to three groups of four digits each and 
then varied the time prior to instructing 
S as to which of the three groups he was 
to recall. This differs from retained 
members by allowing selectivity of re- 
call. One result of this technique is to 
show that part of the loss in reten- 
tion is associated with the actual record- 
ing of the items at the time of recall and 
that this becomes relatively less impor- 
tant as the retention interval increases. 

In a serial task it is often necessary 
to leapfrog memory, that is, to memo- 
rize new material while retaining the old 
which is itself to be recalled within the 
context of the task. Kay and Poulton 
(1951) found in this situation that the 
retention of items for later recall im- 
paired prior recall performance. Poulton 
(1953) found his Ss unable to recall the 
previously learned items and assimilate 
simultaneously new stimuli when rapid 
rates of presentation were used. In this 
situation both recall and assimilation of 
the new material suffered. The authors 
interpret these results as showing that 
recall is an active process which can in- 


all the material coming in to him, a cer- 
tain amount must be stored prior to 
active reorganization. 

A series of studies of immediate 
memory have involved simultaneous 
presentation of stimuli (Broadbent, 
1954, 1957; Brown, 1954). In an early 
such study (Brown, 1954), Ss were 
presented with arrows and numbers 
simultaneously. After receiving all of the 
stimuli Ss were required to recall all or 
part of what they had seen. If S had to 
recall the arrow list followed by the 
number list, he did significantly more 
poorly on the numbers than if he recalled 
numbers alone. This is similar to the 
studies on gaps filled with irrelevant ma- 
terial, since the prior recall of arrows 
reduced the number score. If S recalled 
the numbers and then the arrows, his 
number score was also worse than recall 
of them alone. Thus, the mere retention 
of arrows while recalling the numbers re- 
duced the score. This is confirmation of 
the work previously mentioned (Kay & 
Poulton, 1951; Poulton, 1953), which 
showed retention to be an active process. 
Broadbent (1954) developed a tech- 
nique for presenting different digits si- 
multaneously to both ears in a binaural 
presentation. With binaural presentation 
at fast rates, two digits per second, Ss 
usually delay the digit presented to one 
ear until after recalling all digits pre- 
sented to the other ear. Using the 
binaural technique, Broadbent (1957) 
proceeded to analyze the effects of time 
delay and interpolated material on 
immediate memory. Six digits were pre- 
sented to the right ear and two more 
to the left ear. Interstimulus intervals 
of .5 second and 1 second were used. 
The results of this study are quite con- 
sistent with the results of the sequential 
presentation studies. The study shows 
that mere time in store produces a 
decrement in recall, but the interpolation 
of irrelevant material produces a much 
larger decline in recall scores. 

Moray (1960) made a more com- 
plete analysis of immediate memory 
using this procedure. He compared 
Broadbent's method, in which both digits 
were presented at precisely the same 
time, one to each ear, and a procedure 
in which the digit to the left ear slightly 
preceded the digit to the right ear, leav- 
ing the overall rate the same. In the 
former condition he confirmed Broad- 
bent's finding that at fast rates Ss were 
better able to report all the digits on one 
ear and then all the digits on the other 
ear. With the latter procedure, however, 
he found that there was no difference 
whether S reported the digits interleafed 
or on alternate ears. Subsequently, it 
was found (Broadbent & Gregory, 1961) 
that when the digits were presented by 
either method to eye and ear it was im- 
possible for Ss to interleaf at fast rates 
(two per second or greater). Gray and 
Wedderburn (1960), however, found 
that if Ss could hear an actual word or a 
complete sentence when they followed 
the strategy of interleafing ears, it was 
possible to overcome the grouping by 
ear. 


Span of Perception 


Another way in which simultaneous 
presentation can be investigated involves 
S’s reports when a single exposure com- 
posed of a number of items or dimen- 
sions is presented. These studies on the 
span of perception (Miller, 1956; Wood- 
worth & Schlosberg, 1954) are clearly 
related to those on memory span. 
Recently, Miller (1956) has pointed out 
that the span of perception has roughly 
the same value as the normal mem- 
ory span, about seven items. This rela- 
tionship has been suggested even more 
strongly by Lawrence and LaBerge 
(1956). Stimuli consisting of two cards, 
each containing one of four colors, one 
of four types of figure, and from one 
to four figures were exposed to Ss for 
.1 second. In the various conditions Ss 
were instructed to recall all or part of 
the material presented. It was found that 
if Ss achieve superior performance on 
one dimension, they show a correspond- 
ing decrement on the other dimensions. 
Of more importance, when S was in- 
structed after reception on which dimen- 
sion was to be first recorded, he had just 
as good a score on that dimension and as 
large a decrement on the other dimen- 
sions as if he had been instructed be- 
fore the presentation to concentrate on 
one aspect. This result shows the close 
link between perception and immediate 
memory, for the effect of instruction to 
emphasize one aspect in viewing the ex- 
posure was not greater than could be 
accounted for by the fact that the 
emphasized dimension was recalled first. 
However, Brown (1960) found that un- 
der certain conditions an increment in 
performance can be due to selective per- 
ception, which is over and above that 
due to recall. 

Two extensive experiments (Averbach 
& Coriell, 1961; Sperling, 1960) have 
explored the memory function resulting 
from short exposures of complex visual 
stimuli by the method of partial report, 
which is analogous to the method of 
poststimulus cuing (Anderson, 1960) for 
sequential stimulation. Sperling found, 
using a wide variety of letter displays, 
that the immediate memory span ranged 
from 3.8 to 5.2 letters, with an average 
of 4.3 letters. This span was remarkably 
consistent for each S. Sperling then used 
a tone to cue S with regard to what 
part of the stimulus information he was 
to report. Sperling weighted the percent- 
age of items correctly reported from the 
randomly selected partial report by the 
number of equiprobable nonoverlapping 
reports in order to estimate the total 
amount in store at the start of the recall. 
Using this technique he found that there 
was for all Ss a much larger amount of 
information available immediately after 
the stimulation than could be reported 
with the whole report technique. While 
the immediate memory span results 
showed remarkable consistency regard- 
less of the amount of input information, 
the partial report method revealed that 
the information available immediately 
after stimulation grew virtually as a 
linear function of the amount of input 
information. Roughly twice the number 
of items were available to S through 
the partial report technique than could 
be discovered by the whole report. Sper- 
ling found a rapid decay of the informa- 
tion available with the greatest loss 
occurring within the first .25 second. 
Sperling also confirmed work (Kay & 
Poulton, 1951; Poulton, 1953) showing 
that retaining in memory something for 
later report interferes with the ability 
to recall earlier material. He compared 
the first part of a whole report with a 
partial report including the same infor- 
mation and found that with partial report 
the accuracy was 90%; with whole re- 
port the accuracy fell to 69%. Averbach 
and Coriell (1961) found that recall 
varied as a function of mean delay be- 
tween stimulus and cue from 70% cor- 
rect at zero time delay to about 35% 
correct at .2-second time delay. It is in- 
teresting to note that even with advance 
warning of .1 second as to which letter 
would have to be recalled, Ss did not 
reach 100% performance; neither did 
the final level appear to be approaching 
zero. This means that some of the mate- 
rial must reach a more permanent store. 
Taken together these two experiments 
seem to indicate a rapidly decaying im- 
mediate memory system peripheral to a 
limited capacity perceptual system. This 
is in agreement with the interpretation 
that Broadbent (1958) makes of his ex- 
periments in simultaneous auditory and 
in auditory visual presentations. 
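
The weighting used in the partial report method can be illustrated with a short calculation. This is a sketch under stated assumptions, not Sperling's own computation: the display size and the number correct are hypothetical values chosen for illustration, as are the function and parameter names. Only the whole report average of 4.3 letters is taken from the text.

```python
def estimated_items_available(correct_in_partial, items_per_report, n_reports):
    """Estimate the items in store at the start of recall.

    The proportion correct on a randomly cued partial report is weighted
    by the total number of items in the display, that is, by the number
    of equiprobable nonoverlapping reports times the items in each.
    """
    proportion_correct = correct_in_partial / items_per_report
    display_size = items_per_report * n_reports
    return proportion_correct * display_size


# Hypothetical example: a display of 3 rows of 4 letters, with an average
# of 3 letters reported correctly from whichever row is cued.
available = estimated_items_available(3, 4, 3)

# Whole report average from the text, used only for comparison.
whole_report_span = 4.3
ratio = available / whole_report_span
```

With these illustrative numbers the estimate is about twice the whole report span, which is the kind of relation the text describes.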


Variable Number of Stimuli between 
Stimulus-Response Combinations 


Some studies have required S to re- 
spond to stimuli after a variable number 
of subsequent presentations. In these 
studies S must hold in storage items 
which will be recalled at a future time 
while responding to earlier items. In this 
situation memory span is found to be 
very much reduced. A study of this type 
by Kay (1953) found that with zero 
stimuli in store, Ss responded 100% cor- 
rectly; with one item in store, this fell to 
90%; with two, 50%; and with three, 
15%. In a more recent study of this 
same type of task (Mackworth, 1959), 
it was found that Ss are unable to store 
more than four items in memory and 
still respond to a criterion of 80% cor- 
rect, regardless of the rate of presen- 
tation used. To a considerable degree 
a decrease in rate could be traded for 
additional items in storage, but only 
below the limit of four items. This same 
task has been used (Kirchner, 1958) to 
demonstrate a large decrement in mem- 
ory span with increasing age. 

Several experiments (Lloyd, 1961; 
Lloyd, Reid, & Feallock, 1960; Reid, 
Lloyd, Brackett, & Hawkins, 1961) 
used the technique of poststimulus cuing 
within the context of a continuous sequen- 
tial task. Lloyd et al. presented Ss with 
items falling into a fixed number of 
generic classes, as, for example, elm and 
willow under tree, Berlin and Moscow 
under city, etc. The S listened to several 
of these and then was given the generic 
name as a cue for recall of the item or 
items of that class which was held in 
store. The authors find that in a variety 
of conditions as the average storage load 
increases the number of errors uniformly 
increases. The average storage load is a 
variable which summarizes the overall 
burden on the immediate memory store 
and which is correlated with both the 
number of items and the delay between 
the presentation of a stimulus and its 
recall. The authors find that variables 
concerned with the number of possible 
items that can occur, that is, the number 
of classes or the number of items per 
class, are not very important in account- 
ing for overall performance. 

Another experiment (Yntema & 
Mueser, 1960) used a very similar 
technique in which Ss were required to 
recall either the states of a number of 
attributes of a single object or one attri- 
bute of a number of objects. The study 
is of particular interest because it yields 
a plot of correct recalls as a function of 
the number of variables and as a func- 
tion of the number of intervening stimuli 
and responses. When S is required to 
store eight attributes of the same object 
(by far the superior of the two condi- 
tions), 5 interpolations reduce the proba- 
bility of recall from 1 to about .8, with 
15 interpolations it declines to .65 and 
does not drop below this even up to 35 
interpolations. While interpolation of 
items consistently degrades performance 
and greatly reduces the memory span, it 
does not prevent retention considerably 
above chance levels. This is in agree- 
ment with studies by Lloyd (1961), 
Averbach and Coriell (1961), and 
Shepard and Teghtsoonian (1961). 


ORGANIZATIONAL FACTORS IN 
IMMEDIATE RECALL 


The first two sections of this paper 
have been concerned with the storage 
time and interference effects which result 
from the nature of the sequential task 
used. Even in these sections it has been 
apparent that one of the key factors 
in immediate memory is the ability of 
S to organize the material. Despite the 
fact that both decay and interference 
theories would predict the less time in 
store the better the recall, the opposite 
is usually true in sequential tasks. How 
S actually does go about placing the re- 
ceived material in store is vitally impor- 
tant in how much is retained, which 
items he recalls, and the type of errors 
he makes. In this section experiments 
will be considered which emphasize the 
role of material, type of response, in- 
structions and strategy in determining 
these dependent measures. 


Amount of Material Retained 
Input Uncertainty 


The idea that the uncertainty of the 
input affects the number of items 
retained in immediate memory stems 
from the work of Jacobs (1887). He 
accounted for the observed recall of 
more digits than letters by suggesting 
that the number of alternatives from 
which a stimulus is chosen is a relevant 
factor in the number which can be 
retained. 

By the use of information theory 
(Shannon & Weaver, 1949), the differ- 
ences in material due to the size of the 
alphabet or aggregate from which they 
are selected can be consistently quantified. 
When the amount of information in 
the stimuli is varied by changing the 
number of equiprobable alternatives 
from which they are selected (Miller, 
1956; Pollack, 1953), it is found that 
the number of items which can be 
retained in immediate memory is rela- 
tively independent of the amount of 
information per item. As the information 
per symbol increases, the number of 
items which can be held in store 
decreases slightly, but not nearly so 
fast as the information per item 
increases. 
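
The relation between the number of equiprobable alternatives and the information per item can be sketched as follows. The spans in the example are hypothetical values chosen only to illustrate the pattern described above and are not data from the studies cited; the function names are likewise illustrative.

```python
import math


def bits_per_item(n_alternatives):
    """Information per item when all alternatives are equally likely."""
    return math.log2(n_alternatives)


def information_retained(span_items, n_alternatives):
    """Total bits held in store if span_items items are recalled."""
    return span_items * bits_per_item(n_alternatives)


# Hypothetical spans: the span in items shrinks slightly as the alphabet
# grows, yet the total information retained still rises.
illustrative_spans = {2: 9, 10: 7, 26: 6}
for n_alternatives, span in illustrative_spans.items():
    total_bits = information_retained(span, n_alternatives)
    print(n_alternatives, span, round(total_bits, 1))
```

If the span were limited by a fixed amount of information, the totals in the last column would be roughly equal; the text reports that they are not.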

There are other ways to vary the 
information in the input besides varying 
the number of alternatives from which 
the stimuli are selected. Two other ways 
are to vary the probability of selection 
of a given stimulus or to place con- 
straints on the sequences of stimuli 
which are chosen. It had long been real- 
ized (Ebbinghaus, 1913) that memory 
for meaningful material is much greater 
than for nonsense material. Information 
theory was used to explore differences 
in material due to unequal probabilities 
and sequential constraints (Aborn & 
Rubenstein, 1952; Marks & Jack, 1952; 
Miller & Selfridge, 1950; Rubenstein & 
Aborn, 1954). Miller and Selfridge used 
seven orders of approximation to English 
and several lengths of lists. They found 
that the percentage recalled decreased 
with the length of passage and increased 
with organization through the lowest 
three orders of approximation with short 
lists, and through the first five orders 
with longer lists. The results, since repli- 
cated (Richardson & Voss, 1960), 
indicate an interaction between length 
and degree of organization, showing that 
the higher order approximations yield 
more effective learning only as the lists 
become long. A follow-up study using 
essentially the same method but requir- 
ing correct order of recall (Marks & 
Jack, 1952) found that the improvement 
in percentage recalled extended with or- 
ganization through all seven degrees 
used. Lawson (1961) extended this 
method into another type of task and 
found that the eye-voice span (number 
of words by which the eye leads the 
voice) increased consistently from zero 
to twelfth order, although the total 
amount of the increase was only slightly 
over one word. It is clear, however, 
from both studies that the increase in 
the number of items recalled with greater 
organization was not equal to the de- 
crease in the information per item. 
This was further investigated by 
Aborn and Rubenstein (1952) and 
Rubenstein and Aborn (1954) who used 
nonsense syllables and various rules gov- 
erning the sequential order, so that the 
information conveyed per syllable was 
from 1.5 to 4 bits. The results show 
quite clearly that the number of items 
recalled increases with the degree of or- 
ganization of the material, but not as fast 
as the bits per syllable decrease. Thus, 
in contradiction of the hypothesis with 
which they started, the authors find 
that more information can be retained 
the less the organization of the material. 
These results are in complete agreement 
with those studies varying information 
through the use of different size alpha- 
bets (Miller, 1956; Pollack, 1953). An- 
other study (Miller, 1958) varied the 
information content in strings of letters, 
but without informing Ss about the 
statistical nature of the strings used. His 
finding is in agreement with the previous 
studies. Hogan (1961) used similar ma- 
terials in a copying task and found that 
the number of looks to copy a fixed mes- 
sage declined with redundancy. As prac- 
tice increased Ss made more use of the 
redundancy, but never reached the level 
required by the constant information hy- 
pothesis. 
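
What the constant information hypothesis would require can be worked out from the range of 1.5 to 4 bits per syllable given above. The sketch below is illustrative only: the baseline of five syllables recalled at 4 bits is a hypothetical figure, not a result from the studies, and the function name is an assumption.

```python
def items_predicted_by_constant_information(capacity_bits, bits_per_item):
    """Items recalled if a fixed number of bits could be held in store."""
    return capacity_bits / bits_per_item


# Hypothetical baseline: 5 syllables recalled when each carries 4 bits.
baseline_items = 5
capacity_bits = baseline_items * 4.0

# Prediction for the most organized material (1.5 bits per syllable).
predicted_at_low_information = items_predicted_by_constant_information(
    capacity_bits, 1.5
)
required_gain = predicted_at_low_information / baseline_items
```

Under the hypothesis, recall would have to rise by a factor of 4/1.5, or about 2.7, across this range. The studies reviewed report increases smaller than the hypothesis requires.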


Output Uncertainty 

Studies which reduce the limitations 
of immediate memory by providing in- 
formation about the stimuli in advance 
of their presentation (redundancy) have 
been considered. This section will con- 
sider studies which require S to hold 
in store only a part of the information 
in the stimulus presentation. 

When complex materials are used as 
stimuli in sequential tasks involving 
memory, the response is often changed 
from reproduction to recognition. In a 
task comparing recognition, recall, and 
naming of complex figures (Anderson & 
Leonard, 1958), it was concluded that 
the advantage of random figures in a 
recognition task results from S’s ability
to make his discriminations on the
basis of distinctive parts of each figure. 
Considering recognition of simple uni- 
dimensional stimuli, there seems to be 
little difference between recognition and 
complete recall since, in either case, the 
single available aspect must be retained 
to complete the task. Shepard and 
Teghtsoonian (1961) provide a good 
deal of evidence on retention in a recog- 
nition task. They studied recognition 
within the context of a continuous se- 
quential task in which S is presented 
with successive three-digit numbers and 
is required to say if he has seen them 
previously in the list. The probability of 
S correctly recognizing a number seen 
previously declines consistently with the 
number of interpolated stimuli. With 
zero interpolation, S is 100% correct; 
this declines rapidly to 80% with 5
interpolations, and then levels off so that after
20 interpolations 70% are correct, and
with 60 interpolations 55% are correctly 
recognized. Thus even with 60 inter- 
polations recognition is above chance. 
These results are not unlike those found 
by Yntema and Mueser (1960) for re- 
call, but the situations are too different 
for accurate comparison. Shepard esti- 
mates the amount of information being 
retained by S at 32 bits which is quite 
close to the information retained as 
measured by the memory span for digits 
and for letters (Brener, 1940; Jacobs, 
1887; Pollack, 1953). The comparison 
of recognition and recall is done more 
precisely in a recent study (Davis, 
Sutherland, & Judd, 1961) in which pairs 
of letters were used as stimuli. Recognition
was superior as long as the number
of alternatives from which the recognition
was made was below the total
number of possible combinations; when 
the alternatives equaled the number of 
possible pairs, recall and recognition 
were not different. However, as the
stimuli become more complex and the
number of stimulus dimensions increases,
the possible use of selective remembering
in the task increases, so that with
complex stimuli (Strong, 1912) a large
number can be recognized but only a few
can be completely recalled.

Pollack (1959) has independently 
varied the number of alternatives from 
which a message (single word in noise) 
could be drawn and the number of al- 
ternative responses which are later pro- 
vided to S. The Ss listen to words em- 
bedded in noise and after presentation 
are given a list of response alternatives 
less than the original number of possible 
words from which the message was 
drawn. Pollack found that the accuracy 
of recall is independent of the message 
uncertainty but varies inversely with 
response uncertainty. He next compared
two types of memory strategy for his Ss:
categorization, in which S decides on the 
word at the time of perception and 
stores only his decision; and representa- 
tion, in which S stores the stimulus and 
does not decide on a response until the 
response categories are presented. Since 
the message uncertainty is of little im- 
portance, it appears that S is placing 
heavy emphasis on the representation 
strategy. To check on this Pollack left
a time delay between presentation of 
the word and S's recall. He found a 
consistent but small decrement in recall 
accuracy, which would tend to indicate 
at least partial use of representation. 

Pollack's conclusion that, in recogni- 
tion tasks, categorization may occur 
after a period of storage is very im- 
portant for work attempting to locate 
the place of short-term storage in the se- 
quence of human information handling. 
It seems reasonable from this work,
from psychophysical studies and studies
of memory for complex forms, that S can
store a representation of the stimulus for
later categorization. The likelihood of
this strategy is greatest when the stimu- 
lus input is complex as in a tachisto- 
scopic presentation (Averbach & Coriell, 
1961; Sperling, 1960), or when more 
than one channel is used (Broadbent, 
1958; Broadbent & Gregory, 1961), or
when the stimulus is difficult to code in 
verbal form (Harris, 1952; Koester, 
1945), or when information about the 
relevant response categories is delayed 
(Pollack, 1959). In these cases the op- 
portunity for demonstrating a decay 
factor is increased. This does not mean
that such an order of processing is
always indicated; in fact, quite the
contrary is the case: in general S attempts
to perceive, categorize, and reorganize
the material prior to storage. Thus it
appears, as Broadbent (1958) has sug- 
gested, that the perceptual system may 
be, in some cases, preceded by a short- 
term store, but that material may also 
be circulated back to the store after 
perceptual organization. In the former 
case, evidence of decay factors is much 
stronger since a representational type 
of storage must be involved. The rela- 
tionship between representational and 
category memory appears to capture dif- 
ferences between a pure memory factor 
and a situation in which memory limita- 
tions have been reduced by reorganiza- 
tion. This is also supported by Mack- 
worth’s (1959) finding that verbal labels 
were of no help over reliance on spatial 
cues with little interference, but became 
of increasing importance as the storage 
load grew. 


Grouping

It has long been realized (Gill & Dal- 
lenbach, 1926; Oberly, 1928) that Ss 
group items within the memory span. In 
one study (Gill & Dallenbach, 1926), Ss 
reported that the groupings were pro- 
vided by accenting of the presentation, 
but since Ss used different size groups, it 
was concluded that grouping depends on 
S, not on the presentation. In an analy- 
sis of grouping (Fraisse, 1945) within 
the memory span, groups were defined as 
a cluster of correct items separated by an 
error. Using this definition, with lists of 
10 digits, Fraisse found that 42.3% of 
Ss use two groups and 32.9% three. The 


former method was most effective when 
judged by the total number of correct 
items. In an earlier study by the same 
author (Fraisse, 1937), five items per 
group were found to be optimal. The 
same essential results are found in the 
running memory span (Pollack et al., 
1959) where instructions to S to group 
digits by fours significantly increased the 
length of the running memory span. 


Effect on Serial Position Curves 


Several experiments on immediate 
memory have shown that the shape of 
the serial order curve depends greatly 
on the material used (Deese & Kaufman, 
1957), upon the instruction to S (Broad- 
bent, 1957; Brown, 1954; Kay & Poul- 
ton, 1951), and upon grouping (Fraisse, 
1945; Pollack, 1952; Waugh, 1960). 


Material and Grouping 


The question of serial order in im- 
mediate memory was investigated in 
1926 (Robinson & Brown, 1926). The 
serial order curve usually shows better 
recall of the early elements than of the 
middle items and some increase in recall 
of the last item over those immediately 
preceding it. Fraisse (1945) sought to 
determine the reason for the increased
efficiency of recall of the first and par- 
ticularly of the last element, which was 
the best recalled in his study. The 
natural groups formed by Ss were 
studied and it was concluded that the 
first and last elements were the most 
common focal points for the organiza- 
tion of groups designed to facilitate re- 
call. Pollack (1952) and Waugh (1960) 
confirm these findings and suggest that 
the immediate memory span is itself 
only the combination of a primacy group 
and a recency group. Waugh (1960) con- 
tends that the primacy group is facili- 
tated through intensive rehearsal, while 
the final or recency group is facilitated 
through a decreased time in store. 

The normal serial order effects can 
be modified by the type of material used. 
In a study (Deese & Kaufman, 1957)
which involved measurements of im- 
mediate recall with seven levels of ap- 
proximation to English, serial order 
curves were obtained at each level of 
organization. The curve for the zero 
order approximation shows a marked 
recency effect. When, however, the
approximation to English is increased, the
serial curve undergoes a slow transfor- 
mation until it shows a large primacy 
effect and relatively little recency effect. 
This interesting study seems to accord 
well with Fraisse's notion (1945) that 
groupings made by Ss are the key to the 
curve shape. This might be interpreted 
as meaning that when the individual sees 
a connected flow of words he naturally 
starts to work from the beginning, the 
most familiar place with connected 
language. With disconnected items either 
the results vary with instruction, types 
of Ss, etc., or the final part of the
series is emphasized, due either to its 
use as a focus for grouping or to the 
effect of lessening of time in store. This 
type of analysis receives some confirma- 
tion from the results of Gibson and 
Raffel (1936) who find little primacy ef- 
fect and a large recency effect in the 
immediate recall of visual forms. An 
important study involving criterion 
learning (Postman & Phillips, 1954) in- 
dicates that the serial order effects of 
primacy and recency are apparent in 
memorization by intentional Ss, but that 
only recency occurs with incidental 
learning. 


Instructions 


With the emphasis that the studies re- 
viewed above have given to the effects 
of organization by S on amount recalled, 
primacy, recency, and other order phe- 
nomena, it seems important to consider 
some studies which show differences in 
serial order with varying instructions. 
A study (Kay & Poulton, 1951), certain 
aspects of which have been discussed 


previously, varied S’s knowledge with 
regard to the order of reproduction of 
eight items. If S knew that he was to 
reproduce these items in the order given, 
he produced a typical serial order curve 
which showed primacy and recency, but 
with the first half of the series better 
retained. If, however, he did not learn
until after reception whether he was to
reproduce the series in the order received
or to recall Items 5 through 8 first,
followed by Items 1 through 4, and he was
then in fact instructed to recall in the
order received, his curve showed much greater
retention of Item 5 and no overall effect
of recency. In effect, the curve was two
first halves of a serial order curve, one 
starting with the first item and the other 
with Item 5, but with the second half 
below the first. 

In a study using binaural presentation
of digits (Broadbent, 1957), the effect of 
S's knowledge and experience on serial
order curves was strikingly demonstrated.
When the recall order was prescribed
before presentation, the normal
order curve was obtained. When, however,
S was not informed until after
presentation, it was found that unprac- 
ticed Ss showed normal serial order 
curves, while practiced ones reproduced 
the two halves virtually alike. The shift 
from normal serial order to equal recall 
was accompanied by an increase in effi- 
ciency of retention. This lack of serial 
order preferences in Ss who did not 
know the order of reproduction was also 
found by Brown (1954) although his 
data did not show the shift with practice. 
Broadbent (1958) interprets his results 
to mean that Ss learn to start active 
rehearsal of the first items received even 
in the absence of knowledge about the
order, because such active organization 
leads to better overall scores even if the 
recall order sometimes is reversed. 


Effect on Types of Errors


Conrad (1959) analyzed data from a 
number of studies of immediate recall of 
digits. Four types of errors emerge from 
these studies: transposition, omission, 
substitution, and serial intrusions. 
Transpositions account for 50% of the 
errors and are dominant for people who 
score well rather than those who score 
poorly. It is interesting that transposi- 
tion errors are errors of order and not of 
content. Conrad (1959) and Brown 
(1958) both suggest that order preserva- 
tion has a unique status and should be 
studied separately from content recall. 
Omission errors are, perhaps, the most 
difficult to determine and the most de- 
pendent on the method of reproduction 
and the instructions used. Conrad
analyzed immediate memory data in a
confusion matrix to determine whether
substitutions of one digit for another were genuinely
random. The only substitution signifi- 
cantly greater than chance was the num- 
ber three in place of two. 

The final type of error considered was 
the serial intrusion of items from previ- 
ous series. Conrad (1960a) found that 
substitution from the same position of 
a previous list occurred more often than 
it would by chance. The substitutions 
were more closely related to S’s repro- 
duction of a previous message than to 
the actual message. Conrad (1960a) 
varied the interval between successive 
series from 15 to 40 seconds. He found
that the number of intrusions was
reduced to half, but that the overall
number of errors remained constant,
illustrating the lack of a causal relation between
intrusions and error. Proactive effects 
of material in immediate memory have 
also been studied by Murdock (1961) 
and Peterson (1961). Murdock (1961) 
found that half of all errors were intru- 
sions from the words read immediately 
prior to the one which was to be recalled 
and that an additional third was from 
previous trials of the session. The
findings agree with Conrad (1960a) that the
effects are of relatively short duration. 


The viewpoint, first proposed by R. A. Fisher, of nonparametrically
interpreting the analysis of variance of randomized experiments is
presented and the rationale illustrated.


The general reader of Anderson’s 
(1961) cogent review in a recent issue 
of the Psychological Bulletin is likely 
to infer that nonparametric or “more- 
or-less distribution-free tests" are re- 
stricted to tests of contingency tables, 
rank-order, and various median-type 
tests. In any event, the reader will very 
probably conclude that, by contrast with 
the nonparametric, parametric tests are 
associated to a considerable extent with 
the analysis of variance and "assume 
equinormality, i.e., normality and some 
form of homogeneity of variance" (An- 
derson, 1961). The fact is that randomi- 
zation in an experimental design is an 
essential ingredient, basic to the statisti- 
cal inference. Moreover, the physical act 
of randomization ensures that the usual 
analysis of variance significance tests 
are, to a good approximation, nonpara- 
metric, and the reader should be thus 
informed. Perhaps of greater relevance,
the same statement may be made about 
confidence interval estimation. 

Historically, Fisher (1960) appears 
to have originated the method. In his 
1935 edition (as well as in succeeding 
editions such as the seventh) of Design 
of Experiments, he analyzed nonpara- 
metrically the data of an experiment by 
Charles Darwin which was arranged in 
a matched pairs design (commonly
referred to as a "randomized block
design," and also termed [Lindquist, 1953]
a "treatments by levels design") by
calculating exactly a significance probability


1 This note was prepared in connection with
studies supported in part by the Research
Section of Hastings State Hospital, Minnesota,


of 5.267% (computational details of
the technique are to be found, e.g., in 
Siegel, 1956, pp. 88-91). By contrast, 
the usual method of analysis which
assumes normality of the parent
population distribution, i.e., the Student t test
for matched pairs, resulted in a
significance probability of 5.158% corrected,
and 4.970% uncorrected. (Correction is
based upon an extension to the t
distribution of Yates’ chi square adjustment
for discontinuity [Fisher, 1960, pp. 47- 
48].) Fisher (1960) notes that "the 
physical act of randomization . . . af- 
fords the means . . . of examining the 
wider hypothesis in which no normality 
of distribution is implied" (p. 45; 
italics added). 

Earlier, Eden and Yates (1933), 
stimulated by Fisher, performed a simi- 
lar empirical analysis with respect to the 
Fisher F test. Theoretical work on the
goodness of the approximation of the 
exact discrete variance ratio distribution 
by means of the continuous Snedecor- 
Fisher F curve was carried out by Welch 
(1937) and by Pitman (1937a, 1937b). 
Also relevant are asymptotic results on 
the limiting distribution of the variance 
ratio, such as those of Wald and Wolfo- 
witz (1944) for the randomized block 
design and of Silvey (1954) for the 
completely randomized design. 

It may be helpful to point out an anal- 
ogy. The relationship of the continuous 
F distribution to the exact discrete 
randomization F distribution is
analogous to the case, well known to
psychologists, of the relationship of the
continuous normal distribution to the exact
discrete binomial distribution—the latter 
relationship being familiarly referred to 
as the “normal approximation to the 
binomial.” In both instances, no doubt 
further empirical and theoretical work 
will be forthcoming. 

A pictorial representation is possible. 
Thus Figure 1 illustrates geometrically 
the approximation of an exact
randomization t distribution (the histogram) by
the corresponding continuous t
distribution (the smooth curve superimposed on
the histogram). The essential details of
this concrete example follow.²

The data upon which the exact distri- 
bution in Figure 1 is based are derived 
from 7 of the 15 matched pairs of plants 
in Darwin’s classical experiment (as re- 
counted by Fisher, 1960 in the refer- 
ence alluded to previously) and are pre- 
sented in Table 1. (The 7 pairs con- 
sidered here are a random subsample of 
the 15 pairs given in Table 1 on page 30 
of Fisher’s 1960 study. A discussion 
analogous to that given below applies, 
of course, to the entire sample of 15 pairs. 
An extract, rather than the entire origi- 
nal data, was used here solely for the 


2 The author is indebted to B. P. Hsi for 
completing the onerous arithmetic involved in 
the construction of Tables 2-6. 
purpose of easing the arithmetical labor
of the illustration.) In each matched 
pair one plant was cross-fertilized, the 
other plant being self-fertilized. The 
pertinent response measure is difference 
in height between members of the same 
pair. Assuming for illustrative purposes 
(as does Fisher) that randomized allo- 
cation was used, i.e., assuming that the 
plants of each pair were randomly allo- 
cated to the two experimental conditions, 
then the total number of possible 
samples, or experimental outcomes, is $2^7$
or 128. 

On the hypothesis to be tested, the 
t value corresponding to each of these 
samples can be calculated. The exact 
randomization distribution consists of 
these 128 t values, and this distribution
is approximated by Student's continuous
t curve with df = 6 (i.e., n − 1 df, where
n=7, the number of matched pairs). 
Thus, in order to construct the exact 
distribution, it should be observed that 
the obtained results, Table 1, constitute 
one of the 128 possible experimental con- 
figurations. The remaining 127 possible 
samples generated by randomization may 
now be listed and the t statistic
computed for each such sample. For
example, in Table 1 the observed t = 1.05.
If, however, the sixth pair of plants had 
been randomly allocated oppositely—the 


TABLE 1

HEIGHTS (inches) OF MATCHED PAIRS OF PLANTS

                                                  Difference   Coded
Pair                       Crossed    Selfed      (inch)       difference
1 (third in Pot II)        21 1/2     18 5/8       2 7/8          23
2 (second in Pot III)      20 3/8     15 1/4       5 1/8          41
3 (first in Pot IV)        21         18           3              24
4 (fourth in Pot IV)       12         18          -6             -48
5 (first in Pot II)        22         20           2              16
6 (third in Pot I)         21         20           1               8
7 (third in Pot III)       18 1/4     16 1/2       1 3/4          14


Note. Extract of the Darwin data.
TABLE 2

EXACT RANDOMIZATION t DISTRIBUTION,
GROUPED, CORRESPONDING TO
THE DATA OF TABLE 1

                               Relative
t                 Frequency    frequency
 3.5 to  4.5           1         .008
 2.5 to  3.5           3         .023
 1.5 to  2.5           7         .055
 0.5 to  1.5          29         .227
-0.5 to  0.5          48         .375
-1.5 to -0.5          29         .227
-2.5 to -1.5           7         .055
-3.5 to -2.5           3         .023
-4.5 to -3.5           1         .008

other pairs remaining as indicated— 
then, on the null hypothesis of equality 
of treatment effect, t = .81. In this
fashion the entire set of 128 randomization
t values can be generated. (It should be 
noted that the total number of samples 
for the entire Darwin sample [as
analyzed by Fisher in the application
alluded to at the outset] is $2^{15}$ or 32,768,
the corresponding t curve having 14 df.
Obviously, the discarding of 8 pairs, and 
so 32,640 samples, greatly increased the 
tractability of the calculations involved 
in the illustration.) 

The resulting t distribution
corresponding to the data of Table 1 is
presented, in grouped form, in Table 2,
which furnished directly the information
needed to construct the histogram of
Figure 1. Symmetry is seen to be a
property of the exact distribution. The 
mathematical form of the corresponding
Student t curve for 6 df is
$p(t) = [.38265]\,[1 + t^2/6]^{-7/2}$,
which is the basis of the smooth curve,
also symmetric, of Figure 1. For the
seven pairs of Table 1 the computed
t value is 1.05.
Interpolation in Table 2 yields a significance
probability (two-sided) of 37.64%. 
There is, of course, a slight grouping 
error in this interpolation procedure. 
This may be obviated by working with 
the ungrouped distribution, from which 
Table 2 was constructed, with the result 
that the exact significance probability 
is 35.16%. (The computational details 
may be simplified; cf. Siegel, 1956.) 
Consulting the Student t tables for 6 df,
the approximating significance
probability for a t of 1.05 is 34.64%.
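
The construction of the exact randomization t distribution can be sketched in a few lines of code. The sketch below takes the coded differences of the seven pairs (in eighths of an inch), enumerates all 128 sign assignments, and computes the matched-pairs t for each. The function names are illustrative, and the way tied configurations are counted in the two-sided proportion is an assumption: configurations that tie with the observed |t| are counted as at least as extreme, so the proportion obtained need not agree to the last digit with the figure quoted above.

```python
from itertools import product
from math import sqrt


def paired_t(diffs):
    """Student t statistic for matched-pair differences (df = n - 1)."""
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)
    return mean / sqrt(var / n)


def randomization_t_values(diffs):
    """t for every possible sign assignment of the differences (2**n outcomes)."""
    return [
        paired_t([s * d for s, d in zip(signs, diffs)])
        for signs in product((1, -1), repeat=len(diffs))
    ]


# Coded differences (eighths of an inch) for the seven pairs of Table 1.
coded = [23, 41, 24, -48, 16, 8, 14]

t_obs = paired_t(coded)
t_all = randomization_t_values(coded)

# The sixth pair allocated oppositely, the other pairs remaining as indicated.
flipped = list(coded)
flipped[5] = -flipped[5]
t_flip = paired_t(flipped)

# Two-sided proportion; ties with the observed |t| count as extreme (assumed convention).
TOL = 1e-9
p_two_sided = sum(abs(t) >= abs(t_obs) - TOL for t in t_all) / len(t_all)

print(len(t_all), round(t_obs, 2), round(t_flip, 2))
```

Because reversing every sign simply reverses the sign of t, the 128 values fall in mirror-image pairs, which is the symmetry noted for Table 2.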
Finally, the general situation of three 
or more treatments which gives rise to 
the variance ratio F distributions—viz., 
the exact discrete and also the approxi- 
mating continuous—can be developed in 
an entirely analogous fashion. In fact, 
even the present two-treatment setup 
suffices to illustrate the nature of the
results. That is, as is well known, entirely
equivalent to the foregoing t test of the


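[Figure 1: histogram of the exact randomization t distribution with the approximating continuous t curve superimposed; horizontal axis, value of t; vertical axis, relative frequency.]

FIG. 1. Histogram representation of Table 2 and corresponding continuous approximation.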
TABLE 3

ANALYSIS OF VARIANCE AND CORRESPONDING F TEST OF THE
RANDOMIZED BLOCK DESIGN DATA OF TABLE 1

Source of variation                        df                   SS      F
Blocks ("pairs")                           7 − 1 = 6            3487
Treatments ("crossed versus selfed")       2 − 1 = 1             435    1.10
Error                                      (7 − 1)(2 − 1) = 6   2368
Total                                      14 − 1 = 13          6290

matched differences is the corresponding 
analysis of variance F test of the null 
hypothesis, as illustrated in the present 
case in Table 3. In general, the square 
of a t statistic with n' df is an F statistic
with 1 df for the numerator and n' df
for the denominator (cf. Lewis, 1960, p.
319). Here n' = 6 and, within rounding
error, $t^2 = (1.05)^2 = 1.1025 = F_{1,6}$ from
Table 3. By thus calculating the 128
analyses of variance corresponding to
the 128 experimental outcomes, the
randomization distribution can be
deduced for the variance ratio, viz.,
$F = MS_{\text{treatment}}/MS_{\text{error}}$, as presented, in
grouped form, in Table 4. The randomi- 
zation F distribution of Table 4 is pre- 
sented in graphical form as the positively 
skewed histogram of Figure 2. The 
mathematical form of the corresponding 
Snedecor-Fisher F curve for 1 and 6 df is 


$p(F) = [.38265]\,F^{-1/2}\,[1 + \tfrac{1}{6}F]^{-7/2}$


TABLE 4 


EXACT RANDOMIZATION F DISTRIBUTION,
GROUPED, CORRESPONDING TO
THE DATA OF TABLE 1


Relative 
frequency 


625 
488 


031 
031 


which is the basis of the smooth (posi- 
tively skewed) curve asymptotic to both 
axes portrayed in Figure 2. 
Parenthetically, the present situation 
provides a convenient opportunity to 
illustrate the important concept of an 
unbiased design (cf. Folks & Kemp- 
thorne, 1960); viz., a design in which 
there is equality of the expected values— 
over their randomization distributions— 
of the numerator and denominator mean 
squares of the F test (of treatment ef- 
fects) under the null hypothesis. Tables 
5 and 6 present, in grouped form, the 
randomization distributions of “mean 
square for treatments” and “mean square 
for error,” respectively. It may be read- 


TABLE 5

EXACT RANDOMIZATION MEAN SQUARE FOR
TREATMENTS DISTRIBUTION, GROUPED,
CORRESPONDING TO THE NUMERATOR OF
F IN TABLE 4

MS for treatments    Midpoint    Frequency
0-100                    50          48
100-200                 150          20
200-300                 250           8
300-400                 350           4
400-500                 450          12
500-600                 550           2
600-700                 650          10
700-800                 750           2
800-900                 850           4
900-1000                950           2
1000-1200              1100           6
1200-1600              1400           6
1600-2200              1900           4
RICHARD B. McHUGH


[Figure 2: histogram of the exact randomization F distribution with the approximating continuous F curve superimposed; horizontal axis, value of F; vertical axis, relative frequency.]

FIG. 2. Histogram representation of Table 4 and corresponding continuous approximation.


ily verified that the average of each 
distribution is, within rounding error, 
identical—the common value (here 400) 
being $\sigma^2$, the theoretical randomization
error variance per experimental unit. 
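
The equality of the two averages can be checked by direct enumeration. The sketch below assumes the coded differences of Table 1 (in eighths of an inch) and uses the standard reduction of the two-treatment randomized block analysis to the pair differences: the treatment mean square is (Σd)²/2n on 1 df, and the error mean square is [Σd² − (Σd)²/n]/[2(n − 1)] on n − 1 df. The function and variable names are illustrative only.

```python
from itertools import product


def mean_squares(diffs):
    """Treatment and error mean squares of the two-treatment block design,
    written in terms of the within-pair differences."""
    n = len(diffs)
    total = sum(diffs)
    sum_sq = sum(d * d for d in diffs)
    ms_treat = total ** 2 / (2 * n)                       # 1 df
    ms_error = (sum_sq - total ** 2 / n) / (2 * (n - 1))  # n - 1 df
    return ms_treat, ms_error


# Coded differences (eighths of an inch) for the seven pairs of Table 1.
coded = [23, 41, 24, -48, 16, 8, 14]

# Mean squares for the observed allocation and the variance ratio F.
ms_treat_obs, ms_error_obs = mean_squares(coded)
f_obs = ms_treat_obs / ms_error_obs

# Mean squares under each of the 2**7 = 128 possible allocations.
outcomes = [
    mean_squares([s * d for s, d in zip(signs, coded)])
    for signs in product((1, -1), repeat=len(coded))
]
avg_treat = sum(t for t, _ in outcomes) / len(outcomes)
avg_error = sum(e for _, e in outcomes) / len(outcomes)

print(round(ms_treat_obs), round(f_obs, 2), round(avg_treat), round(avg_error))
```

Both averages reduce algebraically to Σd²/2n, since the cross-product terms of (Σd)² cancel over the sign assignments. The grouped Tables 5 and 6 reproduce this common value only within grouping error.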
A useful summary of the present note 
is provided by the following pronounce- 
ment of Kempthorne (1955): "Tests of 


TABLE 6

EXACT RANDOMIZATION MEAN SQUARE FOR
ERROR DISTRIBUTION, GROUPED,
CORRESPONDING TO THE DENOMINATOR
OF F IN TABLE 4

MS for error    Midpoint    Frequency
100-150           125           2
150-200           175           2
200-250           225           4
250-300           275          14
300-350           325           6
350-400           375          30
400-450           425          32
450-500           475          36
500-550           525           2

significance in the randomized
experiment have frequently been presented by
way of normal law theory whereas their
validity stems from randomization
theory” (p. 947).


A THEORETICO-HISTORICAL REVIEW OF THE
THRESHOLD CONCEPT¹


JOHN F. CORSO 
Pennsylvania State University 


This paper traces the concept of threshold from its classical beginnings 
and shows the relation of the concept to selected issues in contemporary 
psychology. Emphasis is placed on three main problems: the designa- 
tion of the origin point on a psychological continuum; the interpretation 
of the sensory threshold as an intervening variable and the issue of sensory 
continuity-noncontinuity; and the specification of the response threshold 
as a dependent variable of behavior, rather than an index of organismic 
sensitivity. Reference is made to adaptation level theory and the theory 
of signal detection as possible approaches to the development of a complete 
psychophysics which does not start from the concept of threshold. 


A few years ago, a symposium was 
sponsored by the American Psychologi- 
cal Association (Chicago, 1960) to honor 
Fechner (1860) for his monumental 
work and to celebrate the centennial an- 
niversary of the birth of psychophysics. 
At the same time, however, some newer 
concepts were advanced to supplant 
those originally proposed by Fechner. 
One of these involved a revision of his 
logarithmic psychophysical law (Stevens, 
1961a) and another raised a question 
concerning the existence of sensory 
thresholds (Swets, 1961). The purpose 
of the present paper is to provide a 
theoretico-historical review of the thresh- 
old concept as it relates to these and 
other selected problems in psychology. 

The attempt of this paper is not to 
derive a new or revised version of the 
threshold concept, nor to abandon it, but 
to analyze the manner in which this con- 
cept has been used in the past and pres- 
ent history of psychology. It is believed 
that by reviewing the literature to as- 
certain the various meanings of the term, 
the issue of threshold should become 
clearer, thereby forestalling some


1 The first draft of this paper was presented
originally at the second annual meeting of the
Psychonomic Society, Columbia University,
New York City, September 2, 1961.

irrelevant arguments which might otherwise
be generated. 


THE PROBLEM OF ESTABLISHING THE
BEGINNING POINT OF SENSATION


Although the doctrine of “degrees of 
consciousness,” and of the “uncon- 
scious,” can be traced as early as 1714 
to Leibnitz (1890), the term “thresh- 
old” was introduced into psychology 
by Herbart (1824),² who defined the
"threshold of consciousness" as that 
"boundary which an idea appears to 
cross as it passes from the totally
inhibited state into some (any) degree
of actual ideation.” The general notion 
was advanced by Herbart that intensive 
ideas could be made to disappear below 
the threshold of consciousness by a 
process of inhibition. This could occur,
however, in Herbart’s formulation only 
to the weakest of three ideas of un- 
equal strength. Given two simultaneous 
ideas of unequal strength, neither would 
be able to suppress the other below the 
threshold, or limen. While Herbart did 


2 The term Schwelle (threshold) appears to
occur initially in Herbart's Psychologische
Bemerkungen zur Tonlehre published in 1811 and
reprinted in 1889 (see Herbart, 1889). The 
Psychologie (Herbart, 1824), however, presents 
the threshold concept in detail and explains the 
methods of calculation. 
not attempt to support his imaginary
mechanics of ideas with experimental 
data, he proposed an equation for de- 
termining the rise time of ideas which 
clearly anticipated the problem of the 
psychophysical measurement of sensa- 
tion. 

The Herbartian tradition was con- 
tinued by Lotze (1852) but it remained 
for Fechner (1860) to establish the 
quantitative meaning of the term thresh- 
old by using it in his measurement 
formula. Fechner's major concern was 
the problem of ascertaining the func- 
tional relation between two series of 
phenomena (mind and body) and, in 
particular, to establish the law which 
would describe the manner in which the 
intensity of “mental” activity varied 
with changes in the intensity of its un- 
derlying physical activity. He ap- 
proached the problem by establishing a 
metric principle of sensitivity in which 
sensitivity was related to organic ir- 
ritability or excitability, i.e., the or- 
ganism’s capacity to respond to stimula- 
tion. Fechner reasoned that sensation 
(a mental magnitude) could not be 
measured directly and, therefore, had to 
be approached indirectly by way of 
sensitivity. In “measuring sensitivity," 
he distinguished between (a) absolute 
sensitivity, or the inverse of the stimu- 
lus magnitude sufficient to give rise to 
a particular sensation; and (b) differ- 
ential sensitivity. Differential sensitivity 
was expressed in either of two ways: as 
simple or absolute differential sensitivity, 
by noting the inverse value of the abso- 
lute difference between two stimuli which 
could arouse two (just noticeably) dif- 
ferent sensations; or as comparative or 
relative differential sensitivity, by noting 
the inverse value of the ratio of these 
same two stimulus magnitudes. 

From these fundamental psychophysical
metrics and a few simplifying assumptions,
Fechner proceeded to derive his
measurement formula: $\gamma = K(\log \beta - \log b)$,
in which $\gamma$ is the magnitude of
the sensation, $K$ is a constant, $\beta$ is the
magnitude of the stimulus, and $b$ is the
absolute threshold value of the stimulus.
Fechner believed that the sensation $\gamma$
disappeared at the threshold value of the
stimulus, i.e., when $\beta = b$, and that each
sensation “is built up (in equal incre- 
ments) from the zero point of its exist- 
ence.” The issue at this time does not 
concern the validity of Fechner’s law, 
but centers around the problem of the 
“zero-point”—the problem of establish- 
ing the beginning point of sensation. 
Fechner was well aware of the difficul- 
ties present in measuring the absolute 
threshold (stimulus limen) and devel- 
oped elaborate psychophysical methods 
for establishing the limen in statistical 
terms. The absolute threshold is now 
usually defined as “that low stimulus 
quantity that arouses a response 50 per 
cent of the time” (Guilford, 1954). This 
means, however, that stimulus values 
lower than the absolute threshold will 
also be able to arouse reportable re- 
sponses; these will occur less than half 
the time but, nevertheless, a measurable 
percentage of the time. Hence, what 
value of the stimulus should be desig- 
nated as corresponding to the beginning 
point of sensation? For Fechner, the 
answer was the absolute threshold; at 
the liminal value of the stimulus, sensation
vanished. In this approach the
threshold is defined arbitrarily in statisti- 
cal terms; consequently, the problem 
which arises is that any change in the 
criterion percentage of response will 
change the zero point of the postulated 
psychological continuum in relation to 
the underlying physical continuum. 
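
The dependence of Fechner's zero point on the response criterion can be
illustrated numerically. The sketch below assumes, purely for illustration,
that the probability of a report is a cumulative normal function of stimulus
magnitude; the mean, spread, value of K, and the 25% criterion are
hypothetical and are not taken from any study cited here.

```python
import math
from statistics import NormalDist

# Hypothetical psychometric function: probability of a report is a
# cumulative normal in stimulus magnitude (mean 10, sd 2, arbitrary units).
PSYCHOMETRIC = NormalDist(mu=10.0, sigma=2.0)


def threshold(criterion):
    """Stimulus magnitude that is reported on `criterion` of the trials."""
    return PSYCHOMETRIC.inv_cdf(criterion)


def fechner(beta, b, K=1.0):
    """Fechner's measurement formula: gamma = K (log beta - log b)."""
    return K * (math.log10(beta) - math.log10(b))


b50 = threshold(0.50)  # conventional absolute threshold
b25 = threshold(0.25)  # a laxer, equally arbitrary criterion

# The same stimulus (beta = 20) receives a different sensation value
# depending on which criterion fixed the zero point b.
for b in (b50, b25):
    print(f"b = {b:.2f}: sensation at beta = 20 is {fechner(20.0, b):.3f}")
```

Lowering the criterion lowers b, so every stimulus is assigned a larger
sensation magnitude even though nothing about the observer has changed.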
This problem has plagued contempo- 
rary psychologists, particularly those in- 
volved in scaling sensory magnitudes. 
For example, Stevens (1961a) has proposed
a psychophysical power function
to replace Fechner's logarithmic relation
between stimulus magnitude and sensation.
The general form of the power law
is $\Psi = K(\Phi - \Phi_0)^n$, where $\Psi$ is subjective
magnitude, $\Phi$ is stimulus magnitude, $K$ is
a constant, $n$ is an experimentally determined
exponent for a given perceptual
continuum, and $\Phi_0$ is the “effective
threshold.” Supposedly, $\Phi_0$ is some finite
stimulus magnitude which denotes the
beginning point for the related scale of
sensory magnitude; but, unfortunately,
the meaning of $\Phi_0$ (“effective threshold”)
has not as yet been given in terms
of experimental operations. It would
appear, however, that the procedure for
establishing the value for $\Phi_0$ should be
related in some manner to one of the
methods available for determining the
conventional absolute threshold, with
perhaps an adjustment of the criterion
percentage to a value below 50%. Thus,
as in Fechner’s case, the sensation $\Psi$
would reduce to zero at the “threshold”
value of the stimulus, i.e., when $\Phi = \Phi_0$,
and the beginning point of sensation
would again be only an approximation.
As an additional point on $\Phi_0$, it should
be noted that for stimulus values well
above the absolute threshold, the effect
of small changes in $\Phi_0$ will not significantly
alter the values of the power
function; however, the effect will be considerably
more pronounced as the stimuli
become smaller and approach $\Phi_0$ in
magnitude. The importance of specifying
an empirically derived value for $\Phi_0$
is that it permits a more valid test of
the “power law” hypothesis. It does not
appear justifiable in attempting to test
the hypothesis to “adjust” the value of
$\Phi_0$ to that “constant value whose subtraction
from the stimulus values succeeds
in rectifying the log-log plot of
the magnitude function” (Stevens,
1961b). Psychophysical laws are matters
of fact, not expediency.
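
The claim that the choice of effective threshold matters chiefly near the
threshold can be checked with a short calculation. In this sketch the
exponent (0.6), the constant K, and the two candidate threshold values
(1.0 and 1.2) are hypothetical numbers chosen only to show the pattern.

```python
def power_law(phi, phi0, K=1.0, n=0.6):
    """Power law Psi = K * (Phi - Phi0) ** n, set to zero at or below Phi0."""
    return K * max(phi - phi0, 0.0) ** n


def relative_change(phi, phi0_a, phi0_b):
    """Proportional change in Psi when Phi0 moves from phi0_a to phi0_b."""
    a = power_law(phi, phi0_a)
    b = power_law(phi, phi0_b)
    return abs(a - b) / a


# Two hypothetical candidate "effective thresholds": 1.0 and 1.2.
near = relative_change(2.0, 1.0, 1.2)     # stimulus close to Phi0
far = relative_change(1000.0, 1.0, 1.2)   # stimulus far above Phi0
print(f"near threshold: {near:.1%}, far above threshold: {far:.3%}")
```

Under these assumptions the same shift in the threshold parameter changes the
predicted magnitude by more than a tenth near threshold and by a negligible
fraction far above it, which is why an "adjusted" value can straighten a
log-log plot without being independently justified.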
Psychophysicists have yet to provide 
an adequate description of the events
which occur on the assumed sensory 
continuum as the physical stimulus pro- 
ceeds from zero to some particular value 
which produces the criterion percentage 
of responses; yet, the issue is clear. A 
distinction must be made between the 
zero point of sensation which represents 
the starting point for a given psycho- 
logical dimension in accordance with 
some psychophysical law and the arbi- 
trary beginning point of sensation con- 
ventionally defined by the absolute 
threshold. Barrell (1900) was aware of 
this problem and indicated that it would 
be desirable, if possible, “to bridge over 
the hiatus by a (psychophysical) for- 
mula which (would) extend down to the 
origin." 

Several attempts in this direction have 
recently been made. Michels and Helson 
(1949) have proposed a restatement of 
Fechner's law which avoids the difficul- 
ties inherent in the specification of a 
value for Fechner's $b$ and Stevens’ $\Phi_0$.
In this formulation $S = K \log (R/A)$,
where $S$ is the magnitude of the sensation
evoked by the stimulus $R$, $K$ is a
constant, and $A$ is the adaptation level
of the observer as determined in a given
session by prior experience and other
organismic factors, contextual stimuli,
and experimental stimuli. It should be
noted that while A is a constant in this 
formulation, it is derived “from the ac- 
tual conditions of observation and varies 
with series stimuli, backgrounds, stand- 
ards or anchors, etc." to which the sub- 
ject is exposed in a given setting. This 
is probably the relativistic notion to 
which Stevens alludes in his term effec- 
tive threshold. In this case, however, the 
adaptation level prevailing in a given 
experiment is not arbitrarily “adjusted,” 
but can be determined from several ex- 
perimental and computational opera- 
tions. Another favorable consequence of 
the Michels and Helson interpretation of 
Fechner's law is that the problem of
“negative” sensations is avoided; according
to adaptation level theory, stimuli
above the adaptation level arouse one 
kind of response, stimuli near the adap- 
tation level evoke indifferent responses, 
and stimuli below the adaptation level 
elicit opposite types of response. Since 
the adaptation level is approximated as 
the weighted logarithmic mean of all 
stimuli affecting the organism at a given 
moment, the judgment of any stimulus 
is relative to (and depends upon) the 
adaptation level rather than the absolute 
threshold which is fixed for a given set of 
conditions. The developments of adapta- 
tion level theory (Helson, 1959) cover 
such a wide range of behavioral phe- 
nomena (sensory, psychophysical, social, 
cognitive, learning, and clinical) that a 
significant unifying principle seems to 
have been uncovered which can bridge 
the compartmentalization of problems so 
characteristic of contemporary psychol- 
ogy. 
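
The Michels and Helson restatement can be sketched in a few lines. The
stimulus values and weights below are hypothetical, and the weighting scheme
is only a stand-in for whatever weights an actual experiment would assign to
series, background, and anchor stimuli.

```python
import math


def adaptation_level(stimuli, weights):
    """Weighted logarithmic (geometric) mean of the stimuli."""
    total = sum(weights)
    log_mean = sum(w * math.log10(r) for r, w in zip(stimuli, weights)) / total
    return 10 ** log_mean


def sensation(R, A, K=1.0):
    """Michels-Helson restatement of Fechner's law: S = K log(R / A)."""
    return K * math.log10(R / A)


# Hypothetical session: series stimuli of 100, 200, and 400 with equal
# weights, plus a heavily weighted anchor of 900.
A = adaptation_level([100, 200, 400, 900], [1, 1, 1, 3])
for R in (100, A, 900):
    print(f"R = {R:7.1f}   S = {sensation(R, A):+.3f}")
```

Stimuli above the adaptation level give positive values, a stimulus at the
adaptation level gives zero, and stimuli below it give negative values, so
the sign carries the "opposite types of response" without any appeal to
negative sensations measured from an absolute threshold.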

Ekman (1959), exploring the tenabil- 
ity of a psychophysical power function 
for various continua, proposed the form
$S = c(R + a)^n$, where $S$ is the magnitude
of the sensation evoked by the stimulus
$R$, $c$ is a constant related to the unit of
measurement, and $a$ and $n$ are constants
to be experimentally determined. In an
earlier statement of this function, $-R_0$
(the absolute threshold) was used in
place of $+a$; however, estimates based
upon the treatment of fractionation data
repeatedly yielded negative values for
$R_0$. This destroyed the interpretation of
$R_0$ as the absolute threshold. Consequently,
the symbol $R_0$ was replaced by
the arbitrary symbol $a$ which is expected
to be positive in most experiments. The
point to be noted for purposes of this
paper is that the subjective magnitude
aroused by an external stimulus $R$ is not
measured from the value produced by a
threshold stimulus, but is measured from
the value produced by “some stimulus”
with magnitude $a$.

It appears, therefore, that some prog- 
ress is being made on the problem of re- 
lating the beginning point (or a neutral 
point) of the psychological continuum 
to some value of the physical stimulus 
other than the conventional absolute 
threshold. The conventional absolute 
threshold for a given stimulus dimen- 
sion and the zero point of sensation for a 
specific logarithmic or power function 
for that dimension are based on two 
different sets of operational procedures. 
This means that two different concepts 
may be involved and the empirical re- 
sults obtained from the two procedures 
need not be identical. In fact, the pres- 
ent statement of the problem precludes 
such a possibility, since the sensation at 
the absolute threshold (as conventionally
defined) must have acquired a certain
finite magnitude, i.e., must have
passed the zero point for that particular 
sensory continuum. 


THE SENSORY THRESHOLD AS AN INTERVENING
VARIABLE AND THE
CONTINUITY-NONCONTINUITY
ISSUE


The classical notion of threshold 
(stimulus limen) as held by Fechner 
(1860) referred to a point, not exactly 
constant but nearly so, above which 
sensory differences could be detected and 
below which the differences vanished 
into the “unconscious.” Fechner postulated
the existence of four continua: a
stimulus continuum (physical process),
an excitation continuum (physiological
process), a sensory continuum (mental
process corresponding to the excitation in 
a sensory center), and a judgment con- 
tinuum (report of the observed compari- 
son). The threshold was irretrievably 
imbedded somewhere between the excita- 
tion and the sensory continua. The in- 
coming stimulus did not find the brain 
(in its waking state) physiologically
empty, but already occupied by some 
sort of neurological excitations. To be 
sensed, the incoming stimulus and its 
corresponding excitations had to be suffi- 
ciently larger than the neurological ac- 
tivity residually present. 

Thus, a liminal stimulus or a liminal 
stimulus difference was one which lifted 
the sensation or the sense-difference over 
the threshold of consciousness. This 
"barrier" notion of threshold reflected 
the thinking of Herbart (1824). It was 
perpetuated by Titchener (1905) who 
held a somewhat similar view and pro- 
posed that the “frictional resistance" 
characterizing each sense organ had to 
be overcome before the application of 
a stimulus would produce a correspond- 
ing change in sensation. Licklider (1951) 
contended that “in the simplest con- 
ceptual neurology, the stimulus threshold 
owes its existence to the effect of a small 
barrier . . . between successive stages 
in the neural processes . . . ." In this 
classical view, the threshold has a direct 
sensory reference, hence the notion of 
sensory threshold. Stated in current
terminology, the classical concept of 
threshold may be considered as an inter- 
vening variable referring to a postulated 
sensory continuum. The concept of 
adaptation level falls in this same cate- 
gory (Helson, 1959). 

The notion of residual neural activity 
which underlies the classical concept 
of threshold has recently been revived. 
Ekman (1959) appealed to sensory noise 
resulting from spontaneous neural activ- 
ity when he found that in psychophysi- 
cal scaling S (subjective magnitude) 
was greater than zero even though no external
stimulation was present. This suggested
another form of the power function:
$S = S_0 + R^n$, where $S_0$ is the basic
perceptual noise to which is added the
subjective magnitude produced by an 
external stimulus R. The idea of a basic 
level of organismic activity is present in
Helson’s (1947) hypothesis of adapta- 
tion level, in which adaptation level is 
considered to shift along a given stimulus 
continuum as a function of all past and 
present factors influencing behavior. The 
results of a considerable number of 
human and animal electroencephalographic
studies tend to support the Fechnerian
notion of residual neural activity.
For example, Jasper (1936) has shown 
that significant afferent stimuli will pro- 
duce changes in the electrical activity of 
the cortex depending upon the excitatory 
state of the cortex at the time of arrival 
of the stimulus. In a study of single 
neurons in three cerebral regions of the 
cat, Evarts, Bental, Behari, and Hutten- 
locher (1962) have demonstrated that 
there is a greater variance of spontaneous 
discharge rates during waking than dur- 
ing sleep. Thus, the assumption of neural 
“noise” which is found in the develop- 
ment of several psychophysical theories 
seems reasonable when viewed from a 
neurophysiological standpoint. 
Fechner’s concept of threshold gave 
rise to what may currently be called the 
sensory continuity-noncontinuity issue. 
Does sensory excitation increase in a 
smooth and continuous manner or is 
there an abrupt step-like change from no 
sensation to sensation, or from sensation 
to a difference in sensation, as the value 
of the physical stimulus is increased 
continuously along some specified dimen- 
sion? Fechner (1860), Lotze (1884), 
and others, held to the noncontinuity notion
which embraced the threshold concept;
but Delboeuf (1883), Peirce and
Jastrow (1885), Müller (1896), and
others, denied the supposed fact of
threshold. Jastrow (1888) claimed that
the threshold was a misconception in- 
troduced by Fechner’s Method of Just 
Observable Difference. He argued that 
the sensory continuum consisted of “a 
continuous series of intermediate degrees 
of clearness and (that) there (was) no
point on the curve with characteristics 
peculiar to itself, no threshold in any 
true sense." Jastrow believed that sen- 
sation and stimulation each formed a 
continuum and that applying discrete 
notions to them would lead to hopeless 
confusion. 

The numerous experimental studies 
which followed on problems of sensory 
discrimination soon made it evident that 
the threshold at least was not a rigidly 
fixed point or line. The same stimulus, 
even when applied in the same manner, 
was sensed more or less strongly by one 
observer than by another, and by the 
same observer at different times. Con- 
sequently, attempts were made to define 
the various sources of sensory variability 
and to establish the form of the mathe- 
matical distribution of errors in psycho- 
physical judgments. 

Through the work of Jastrow, Cattell, 
Urban, Fernberger, Thomson, George, 
and others, several sources of variance 
were isolated. These included such fac- 
tors as the observer’s attitude, fatigue, 
attention, interest, and practice. As Cat- 
tell (1893) stated: 


these sources of variation sufficiently account 
for the fact that the same sensation does not 
occur [to the same stimulus]. They are, in- 
deed, so numerous and to a certain extent so 
independent, that they justify . . . the assump- 
tion . . . [that an error is composed of a very 
large number of comparatively small and inde- 
pendent errors], and the results of experiments 
show that the errors are in a general way 
distributed as required by the theory of proba- 
bility (p. 287). 


This is the underlying principle of the 
classical theory of sensory discrimi- 
nation. 

The beginnings of the classical theory 
can be traced to Müller (1878) who 
started from the notion of a non-Fech- 
nerian threshold which was subjected to 
chance variations. Müller and his followers
assumed that the appropriate mathematical
expression for the probabilities
of these variations was the Gaussian ex- 
ponential law, i.e., the frequency of these 
variations was a function of their size. 
Urban (1907), relating the thresh- 
old to experimental data, emphasized 
the notion of the probability of judgment 
(smaller, equal, or greater) and called 
the mathematical expression which gave 
the probability as a function of the com- 
parison stimulus the psychometric func- 
tion of that judgment (Urban, 1910). 
The classical form of the function resem- 
bles the integral of the normal curve 
and is known as the phi function of
gamma ($\phi(\gamma)$-hypothesis) (Urban, 1907).
A complete description of the manner in 
which the psychometric function is gen- 
erated according to the classical theory 
of sensory discrimination can be found 
in a paper by Boring (1917). 
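
The classical psychometric function can be written out directly. The sketch
below assumes the usual textbook statement of the phi-gamma hypothesis, in
which gamma is the stimulus deviation from the limen multiplied by a measure
of precision h, with h = 1 / (sigma * sqrt(2)); the limen of 100 units and
the standard deviation of 4 units are hypothetical.

```python
import math


def phi_gamma(stimulus, limen, h):
    """Cumulative normal psychometric function.

    gamma = h * (stimulus - limen), where h is the measure of precision,
    related to the standard deviation by h = 1 / (sigma * sqrt(2)).
    """
    gamma = h * (stimulus - limen)
    return 0.5 * (1.0 + math.erf(gamma))


# Hypothetical comparison stimuli around a limen of 100 units.
limen, sigma = 100.0, 4.0
h = 1.0 / (sigma * math.sqrt(2.0))
for s in (92, 96, 100, 104, 108):
    print(f"{s:5.0f}   P(judged greater) = {phi_gamma(s, limen, h):.3f}")
```

The function passes through 50% at the limen and rises in an ogive whose
steepness is governed by h, so any procedural change that alters h alters the
slope without necessarily reflecting a change in the discriminatory mechanism.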

In the issue of sensory continuity-noncontinuity,
the psychometric function
is of paramount importance, since the 
specific form of the function is assumed 
to reflect the “inner workings" of the 
discriminatory mechanism. Any devia- 
tion of the psychometric function from 
its classical sigmoidal form opens the way 
for a new hypothesis about sensory dis- 
crimination. Since rectilinear psycho- 
metric functions were obtained by 
Stevens, Morgan, and Volkmann (1941), 
the neural quantum theory of sensory 
discrimination was offered as an alter- 
native to the classical theory. A detailed 
treatment of the derivation of the theory 
has been published by Corso (1956b). 

The characteristics of the psycho- 
metric functions as predicted by the 
neural quantum theory are that: (a) the 
percentage of responses between 0 and
100% will be a linear function of the
size of the stimulus increments; (b) the
slope of this function will be inversely 
proportional to its intercept; and (c) 
the smallest stimulus increment which 
elicits a response 100% of the time will 
be twice the size of the largest increment
which elicits a response 0% of the time,
provided the observer has adopted a
“2 quanta” criterion of judgment. Some
of the technical difficulties which have 
been encountered in testing these pre- 
dictions have already been described 
(Corso, 1956b). It should also be 
pointed out that the tenability of Prediction
b will be affected by the particular
method which is adopted for
obtaining the judgments. Miller and 
Garner (1944) have shown that in the 
quantal method a random order of pres- 
entation of stimuli will extend the 
psychometric function, since the observer 
cannot stabilize his criterion of judgment 
and, consequently, fluctuates from 1 to 
3 quanta. Likewise, in the case of the
phi gamma hypothesis, the slope of the 
psychometric function will be affected 
by the particular variant of the constant 
methods which is adopted, either for 
obtaining the required judgments or for 
computing the measure of precision, $h$
(Guilford, 1954). Barlow (1961), who 
suggests that rectilinear psychometric 
functions may be approximations to (or 
distortions of) ogival functions, has 
developed a hypothesis in which both the 
slope of the psychometric function and 
the associated threshold value on the 
stimulus intensity scale are dependent 
upon the variability of the instantaneous 
values of threshold and the rate of false 
positive responses. It seems, therefore, 
that any conclusions regarding the ten- 
ability of the quantal or phi gamma 
hypothesis and any generalizations con- 
cerning the basic discriminatory ability 
of the human observer which are derived 
from the slope of the psychometric 
function must be tempered with caution. 
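
The three quantal predictions follow from a simple piecewise-linear function,
sketched here under the usual textbook assumptions: the surplus excitation
left by the standing stimulus is uniformly distributed over one quantum, and
the observer reports a change only when a fixed number of additional quanta
are excited. The quantum size of 3 stimulus units is hypothetical.

```python
def quantal_psychometric(delta, Q, criterion=2):
    """Predicted proportion of detections for a stimulus increment `delta`.

    Assumes the surplus excitation is uniform over one quantum Q and that
    `criterion` additional quanta must be excited for a report.
    """
    p = (delta - (criterion - 1) * Q) / Q
    return min(1.0, max(0.0, p))


Q = 3.0                        # hypothetical quantum size, stimulus units
largest_never = (2 - 1) * Q    # largest increment detected 0% of the time
smallest_always = 2 * Q        # smallest increment detected 100% of the time

for delta in (2.0, 3.0, 4.5, 6.0, 7.0):
    print(f"delta = {delta:.1f}   P = {quantal_psychometric(delta, Q):.2f}")
```

With a two-quanta criterion the function is linear between Q and 2Q, its
slope is 1/Q and its intercept is Q, and the 100% point is exactly twice the
0% point. If the criterion drifts between one and three quanta, the observed
function is a mixture of such segments and is correspondingly extended.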
The two theories, the classical and the 
neural quantum, reflect the present 
status of the issue which originated with 
Fechner-Jastrow. The classical theory 
holds that sensation is a continuous 
function of stimulus magnitude, while
the quantum theory holds that sensation 
is a discontinuous function of stimulus 
magnitude. The classical theory is 
derived from the assumption that the 
threshold of the human organism varies 
somehow (perhaps due to physiological 
and/or psychological factors) in accord- 
ance with the normal law of error; the 
quantum theory is derived from the basic 
assumption that the functional neural 
processes which mediate sensory dis- 
crimination operate in a step-wise or all- 
or-none fashion, once the threshold of 
the neural quantum is crossed. While the 
present paper is not an attempt to resolve 
the classical theory-quantum theory con- 
troversy, it perhaps should be noted that 
a review of the experimental evidence on 
the quantum theory has led to the conclusion
that “unequivocal support of
the . . . theory is, for the most part, 
lacking" (Corso, 1956b). 

The position which is usually taken 
by the quantum theorists, when negative 
evidence is obtained with respect to the 
three predictions, is that a sufficient 
reduction in the “noise” of the organ- 
ism, the equipment, or the environment 
has not been achieved by the experi- 
menter. Thus, the quantal step-function 
has not been allowed to manifest itself 
in the action of the sensory system. 
Regrettably, the successful experiment- 
ers have not been able to specify in ad- 
vance the manner in which “the noise in 
a discrimination process (can be re- 
duced) to a level low enough to reveal 
the ‘grain’ in the sensory continuum.” 
Stevens (1961c) has indicated that


those experiments that have been successful
have seemed to involve a carefully contrived
arrangement designed to make the observer's
task as easy as possible, plus a fortunate selection
of observers capable of maintaining an
unwavering attention over extended periods of
time (p. 808).


Any complete treatment of the sensory 
continuity-noncontinuity issue must ultimately
include the concept of the
absolute threshold, as well as that of the 
differential threshold. Quantum theorists 
have suggested, however, that since their 
model assumes a continuous fluctuation 
in the overall sensitivity of the organism, 
rectilinear psychometric functions should 
not be obtained for the absolute limen. 
Here, it is said, sigmoidal functions 
should be evidenced which reflect the 
time distribution of the organism's over- 
all sensitivity. This view opposes the 
fact that some of the earliest experi- 
mental data from which quantal notions 
were advanced were those of Békésy 
(1936) dealing with the absolute threshold
of hearing at low frequencies. Other
experiments on absolute thresholds (De- 
Cillis, 1944) have not produced quantal 
functions and, in a replication and exten- 
sion of the Békésy (1936) study, Corso 
(1961) has shown that when methodo- 
logical artifacts are eliminated, the 
threshold of audibility assumes its usual 
continuous form. The problem of ob- 
taining quantal data in experiments on 
absolute thresholds remains a real and 
significant challenge to the quantum 
theorists. 

While the concepts of absolute thresh- 
old, differential threshold, and psycho- 
physical functions have in the present 
paper been treated in a somewhat 
isolated manner for purposes of exposi- 
tion, their independence may be more 
superficial than real. In a penetrating 
analysis, Ekman (1959) has shown that 
starting from the interpretation of a in 
his power function, a common theoretical 
framework can be worked out which 
embodies both absolute and differential 
sensitivity, as well as the psychophysical 
relation between stimulus magnitude and 
subjective magnitude. These relation- 
ships have been described by a single 
mathematical expression. At least in 
principle, the concept of absolute sensitivity
can be considered to be a special
case of differential sensitivity. As substantive
concepts, the two can be treated
essentially without distinction; whether 

mechanism 


ORGANISMIC SENSITIVITY AND THE
RESPONSE THRESHOLD AS A
DEPENDENT VARIABLE
OF BEHAVIOR


Up to this point, the concepts of 
absolute and differential thresholds have 
with respect 


for a stimulus to be just delet: 


psychometric function, 
value corresponding to a particular point 
on this curve, e.g., 50% or 75% depending
upon the particular experimental procedure,
is taken as the threshold (Boring,


conventional ‘threshold’ "'; 
criteria have also been used. Volkmann 
defined his measure of sensitivity as one- 
half the interval of uncertainty—the dis- 
tance on the stimulus scale between the 
upper and lower limens (Fernberger, 
1914). Jastrow (1888) rejected the 
threshold concept and adopted the 
“standard ratio” as his index, i.e., the
ratio of two stimuli which elicited one 
error in every four judgments. 
Regardless of the specific method 
adopted by various investigators for
computing the threshold value, two 
characteristics of the threshold were 
universally present: (a) the threshold 
was based on a probability of response
(for Hunter, 1924, a language response);
and (b) the probability criterion
was adopted arbitrarily (for Johnson,
1930, a “social criterion”). These considerations
led to numerous objections
to the use of the threshold measure as an 
index of sensitivity (the two measures 
are inversely related). Thomson (1920) 
objected to Volkmann's index by sug- 
gesting that the interval of uncertainty 
depended entirely on 


the subject’s readiness to give the answer 
undecided. It measures therefore rather a moral 
characteristic than a physical sensitivity . . . 
(p. 301) [the threshold] can be varied . . . at 
[the subject’s] whim; and will vary with his 
mood at the moment (p. 307). 


George (1917), Boring (1920), Guil- 
ford (1927), and others, wrote on the 
effect of attitude as it influenced psycho- 
physical judgments; but Fernberger 
(1914) made it explicitly clear that 


one cannot measure sensitivity in absolute 
terms; one can only say that a given sensitivity 
has been found to exist under certain given 
experimental conditions (p. 342). 


Newhall (1928) took a similar position 
and noted 


that the observed . . . threshold is a variable 
depending for its value upon an unknown num- 
ber of external and internal factors (p. 46). 


Fernberger (1930) wrote: 


the old idea that we are measuring the sensi- 
tivity of a particular sense organ has been 
abandoned. We now recognize that we are 
measuring the sensitivity of the entire psycho- 
physical organism—his sense organs, his con- 
centration, his attitude, his acceptance and 
understanding of instructions, his degree of 
practice, and what not besides . . . (p. 111). 


High, Glorig, and Nixon (1961) have 
been concerned with the same problem 
and have enumerated about two dozen 
subclasses of variables which affect the
threshold; these are grouped under four 
major categories: physical, physiological, 
psychological, and methodological.
Considerable experimental evidence is 
now available which supports the con- 
tention that psychophysical judgments, 
on which threshold measures (and, 
hence, sensitivity) are based, are affected 
by a wide range of manipulable con- 
ditions. Some of these relate to the phys- 
ical stimulus, such as rate of onset 
(Goodfellow, 1946) in which higher 
thresholds were obtained with increased 
time delays; some relate to the phys- 
iological state of the observer such as 
sleeplessness, although Goodhill and 
Tyler (1947) found that 100 hours of 
experimental insomnia had no effect on 
hearing acuity. Among the psychological 
factors, Senders and Sowards (1952) 
have found that subjects yield propor- 
tions of judgments in accordance with 
their expectations based upon their 
knowledge of the experimental situation. 
The significance of methodological fac- 
tors has been demonstrated, among 
others, by Corso (1956a) who found
that the method of limits and the method 
of adjustment produced different audi- 
tory thresholds for the same group of 
subjects tested under both methods. In 
another methodological study, Newman 
(1933) was forced to reject the differ- 
ential threshold—just noticeable differ- 
ence (jnd)—as a unit of measurement 
when he showed that two stimuli equal 
in psychological magnitude are not 
reached by the same number of jnd's 
from the stimulus limen. This conclusion
was found to hold for both brightness in 
vision and loudness in audition. In the 
latter case, for example, two frequencies, 
80 and 1,900 cycles per second, were 
judged to have a loudness level of 70 
phons, but the 80-cycle tone was only 
8.3 jnd's above the stimulus limen while 
the 1,900-cycle tone was 115.9 jnd's 
above the stimulus limen. It may be
concluded, therefore, that the classical 
notions of absolute sensitivity are no 
longer tenable; human judgments are 
relative and depend upon a wide number 
of factors which may be operating within 
a given experimental situation at a given 
time. 

This view leads away from notions
of threshold as substantive concepts
(do thresholds exist or do they not)
toward notions of threshold as conceptual
tools. In this approach, the
threshold measure is seen as a way of 
organizing data to arrive at the solution 
of a particular problem. The definitive 
statement of Graham (1950) exemplifies 
this position. He proposes that a psycho- 
physical function is a special case of the 
general equation: $R = f(a, b, c, d, \ldots, n, \ldots, t, \ldots, x, y, z)$,
where $a$, $b$, $c$,
and $d$ refer to specified aspects of the
stimulus, $n$ is the number of stimulus
presentations, $t$ is time, and $x$, $y$, $z$ are
specified as conditions of the organism. 
For Graham (1958) any psychophysi- 
cal function may be expressed as R=f 
(a) where all the variables of the general 
equation are constant, except R and a: 
Such a function is a stimulus-response relation 
that shows how a measure of response varies 
with a controlling stimulus variable. From 
such a relation it is possible to obtain a new 
datum, a value of the stimulus variable that 
corresponds to a given probability of response 
occurrence—for example, the 50% value . - . 
by virtue of this quantity, it is possible to treat 
as a single datum the outcome embodied in a 
total psychophysical function, and so derive 
... perceptual functions . . . functions obtained
from a sequence of experiments concerned
with finding thresholds under many
different conditions (pp. 67-68). 
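
Graham's reduction of a whole psychophysical function to a single datum can
be sketched as an interpolation. The stimulus levels, the response
proportions, and the two conditions below are hypothetical, and linear
interpolation is only one of several ways the criterion value might be
obtained.

```python
def threshold_from_data(levels, proportions, criterion=0.5):
    """Linearly interpolate the stimulus value giving `criterion` response.

    `levels` must be ascending and `proportions` must bracket the
    criterion; both are hypothetical inputs for illustration.
    """
    pairs = list(zip(levels, proportions))
    for (x0, p0), (x1, p1) in zip(pairs, pairs[1:]):
        if p0 <= criterion <= p1 and p1 > p0:
            return x0 + (criterion - p0) * (x1 - x0) / (p1 - p0)
    raise ValueError("criterion not bracketed by the data")


# Hypothetical detection proportions at five intensities under two values
# of a controlling variable (for instance, two background levels).
levels = [1.0, 2.0, 3.0, 4.0, 5.0]
data = {
    "condition A": [0.05, 0.20, 0.50, 0.80, 0.95],
    "condition B": [0.00, 0.05, 0.20, 0.60, 0.90],
}
thresholds = {name: threshold_from_data(levels, p) for name, p in data.items()}
print(thresholds)
```

Each psychometric function collapses to one number, and the set of such
numbers across conditions is the derived function of interest. Choosing a
different criterion, such as 75%, shifts every value, which is the sense in
which the threshold is fixed by its operational definition.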


The importance of this revised orien- 
tation is that it indicates a movement 
away from the concept of threshold as 
an index of sensitivity. Instead, the
attempt is to determine how a critical 
value of the stimulus (the response 
threshold) varies as a function of a controlling
variable. Furthermore, whether
or not a sensory threshold “exists,” the 
response threshold is established by 
means of its operational definition. Con- 
sequently, it is not appropriate to speak 
of threshold measurements; observations
are collected on the probability of cor- 
rect (positive) responses and the thresh- 
old is specified in terms of its operational 
definition. The definition, of course, 
contains an arbitrary selection of the 
probability criterion. In contemporary 
psychophysics, then, the concept of 
threshold is response oriented and is a 
dependent variable of behavior, unlike 
the classical sensory concept with its 
intervening variable connotation. 

The concept of a response threshold 
has found wide applicability in psychol- 
ogy in areas other than psychophysics. 
For example, Martin, Paul, and Welles 
(1915) compared reflex and sensory 
thresholds; Hull (1917) compared the 
fluctuations of threshold in the formation 
and retention of associations among the 
insane; Williams (1918) calculated the 
associative limen in certain memory ex- 
periments; Wells (1919) reported a se- 
ries of experiments involving the thresh- 
old of “conscious” learning; Oberly 
(1928) compared the “attention span” 
limen for ungrouped digits and the 
“memory span” limen for grouped digits; 
Irwin (1932) investigated the thresholds 
for the perception of differences in facial 
expression; Miller (1939) discussed the 
limen of awareness in the problem of 
discrimination without awareness; Post- 
man, Bruner, and McGinnies (1948) 
treated the duration threshold in tachis- 
toscopic presentations as an index of 
selective perception; Kissen, Gottesfeld, 
and Dicks (1957) determined the inhibi- 
tion and tachistoscopic thresholds for 
sexually charged words; and Corso 
(1959) studied changes in auditory 
thresholds as a function of age and sex. 
In each of these investigations, recourse 
was taken to the threshold notion defined 
in a particular manner in order to deter- 
mine the responsiveness of the human 
organism under a given set of conditions. 
The general utility of the response 
threshold as a criterion measure of per- 
formance and the specific meaning which 
can be ascribed to the term by its opera- 
tional definition suggest that the notion 
of response threshold should not be 
abandoned, regardless of the outcome of 
the sensory threshold issue. 

Although the present paper is not 
concerned directly with problems of per- 
ceptual defense and related topics, it 
should nevertheless be pointed out that 
in dealing with response thresholds (as 
with other behavioral indices) the 
question of reliability of measurement 
cannot be ignored. Byrne and Holcomb 
(1962) have reported that the coefficient 
of internal consistency was .00 when the 
"split-half reliability was determined by 
dividing hostile words into odd and even 
groups and computing the differential 
threshold scores for these two groups 
compared to their matching neutral 
words." Thus, while there was independ- 
ent agreement among judges in terms of 
stimulus material and scoring procedures, 
the resulting scores were unreliable. 
This suggests that in studies of 
perceptual defense, where the differential 
recognition threshold has served as the 
dependent variable, some of the incon- 
sistencies among findings may be attrib- 
utable to unknown reliability coefficients 
rather than to theoretical inadequacies. 

One final point remains to be men- 
tioned before closing this section of the 
paper. It relates to the Tresselt and 
Volkmann (1942) hypothesis that psy- 
chophysical judgments and conditioning 
can be explained by the same set of 
psychological principles. The point of 
contact between these two concepts lies 
in the stimulus generalization gradient; 
specifically, given a range of stimulus 
magnitudes on a particular dimension, 
the same judgment category (or re- 
sponse) will occur to a set of stimuli 
within that range, with a particular 
stimulus magnitude from this set being 
associated most frequently with that 
judgment (or response). Guilford 
(1954) has shown that starting with two 
overlapping stimulus generalization 
gradients corresponding to two cat- 
egorical judgments, it is possible to 
determine a threshold value expressed in 
stimulus units. The threshold is taken 
as that stimulus magnitude at which the 
probability of occurrence of that 
stimulus is the same for each of the two 
judgment categories. Cartwright (1941) 
has shown, as expected from learning 
principles, that continued practice re- 
duces the range of equivalent stimuli 
that are capable of eliciting particular 
responses and Johnson (1949) has 
demonstrated that the shift in threshold 
values which accompanies changes in 
the set of judged stimuli tends to 
describe a learning function. These and 
other related studies suggest that further 
efforts in this direction may result in a 
rapprochement of psychophysical theory 
and learning theory, at least for those 
portions of learning theory related to 
problems of discrimination. 
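
The determination of a threshold from two overlapping gradients can be sketched numerically. The Python fragment below is a minimal illustration under stated assumptions: the two gradients are represented as observed proportions of each categorical judgment at a series of stimulus magnitudes, the crossing point is located by linear interpolation, and the function name `crossing_threshold`, the category labels, and the data are hypothetical. It is not a reproduction of Guilford's (1954) procedure.

```python
def crossing_threshold(levels, p_first, p_second):
    """Stimulus magnitude at which two judgment categories are equally probable.

    Locates the sign change in the difference between the two gradients and
    interpolates linearly (an assumption made for illustration).
    """
    diffs = [a - b for a, b in zip(p_first, p_second)]
    for i in range(len(levels) - 1):
        d0, d1 = diffs[i], diffs[i + 1]
        if d0 == 0:
            return levels[i]
        if d0 * d1 < 0:
            return levels[i] + d0 * (levels[i + 1] - levels[i]) / (d0 - d1)
    if diffs[-1] == 0:
        return levels[-1]
    raise ValueError("the two gradients do not cross in the sampled range")


# Hypothetical data: proportion of "light" and "heavy" judgments per weight.
weights = [10, 20, 30, 40, 50]
p_light = [0.90, 0.70, 0.40, 0.20, 0.05]
p_heavy = [0.05, 0.20, 0.50, 0.70, 0.90]

threshold = crossing_threshold(weights, p_light, p_heavy)
print(round(threshold, 2))
```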


AN APPROACH TO PSYCHOPHYSICS 
WITHOUT A THRESHOLD CONCEPT 


In the presentation of the sensory 
continuity-noncontinuity issue, reference 
was made to the problem of noise in the 
psychophysical situation. While the 
quantum theorists have been busy trying 
to reduce noise in discrimination experi- 
ments, the proponents of the theory of 
signal detection (Tanner, Swets, & 
Green, 1956) have introduced noise so 
that at least that part of it external to 
the observer can be appropriately meas- 
ured. In this approach, noise is always 
present and the observer is faced with 
the task of reaching a decision stating 
whether or not a signal was present dur- 
ing a specified time interval. Only two 
assumptions are made about sensory dis- 
crimination in the derivation of the 
theory: it varies continuously due to the 
ever-present noise in the sensory system, 
and it is considered to be a unidimen- 
sional variable insofar as it affects the 
observer’s report. The relevance of the 
theory of signal detection to the prob- 
lems under consideration in this paper 
is that the variables of the theory do 
not contain a reference to threshold 
(Licklider, 1959). 

The theory of signal detection pro- 
vides an interesting and important in- 
novation in the area of sensory discrimi- 
nation since it apparently provides a 
method which permits the separation of 
the observer’s criterion of response and 
his organismic sensitivity (Tanner & 
Swets, 1954). These two measures are 
extracted from an analysis of the observ- 
er’s operating characteristic curve. This 
curve, for different levels of response 
certainty, is simply a plot of the pro- 
portion of “yes” reports made when the 
signal is present against the proportion 
of “yes” reports made when noise alone is 
present. As the observer is induced by 
instructions to change his response crite- 
rion from one set of trials to another, the 
plotted proportions will trace a single 
curve running from 0 to 1.0 on both 
coordinates. The curve is characterized 
by a single parameter, d′, which is the 
difference between the means of the 
signal-plus-noise and noise-alone distributions 
divided by the standard deviation 
of the noise distribution. This parameter, 
d′, is the measure of sensitivity. 
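
The relation between the operating characteristic and the sensitivity parameter (conventionally written d′ in the signal detection literature) can be sketched as follows. This Python fragment is illustrative only. It assumes that the noise-alone and signal-plus-noise distributions are normal with equal standard deviations, an assumption added here for the sake of the example, and the function names and the value 1.5 are hypothetical.

```python
from statistics import NormalDist

UNIT_NORMAL = NormalDist()  # standard normal: mean 0, SD 1 (noise-SD units)


def operating_point(d_prime, criterion):
    """(false-alarm proportion, hit proportion) for one response criterion.

    The noise-alone distribution is centered at 0 and the signal-plus-noise
    distribution at d_prime; "yes" is reported above the criterion.
    """
    false_alarm = 1.0 - UNIT_NORMAL.cdf(criterion)
    hit = 1.0 - UNIT_NORMAL.cdf(criterion - d_prime)
    return false_alarm, hit


def sensitivity(hit, false_alarm):
    """Recover d' as the difference between the two normal deviates."""
    return UNIT_NORMAL.inv_cdf(hit) - UNIT_NORMAL.inv_cdf(false_alarm)


# Moving the criterion traces one curve; every point yields the same d'.
curve = [operating_point(1.5, c) for c in (-1.0, 0.0, 0.75, 1.5, 2.5)]
recovered = [sensitivity(hit, fa) for fa, hit in curve]
print([round(d, 3) for d in recovered])
```

Under these assumptions each criterion setting gives a different pair of proportions, yet the recovered sensitivity is the same at every point on the curve.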

The slope of the operating-character- 
istic curve at any point is equal to the 
value of the likelihood-ratio criterion 
which produces that point. According to 
the theory, the observer knows the 
probability with which each possible 
“excitatory state” will occur during an 
observation interval when noise-alone is 
present and during the interval when 
signal-plus-noise is present. The ob- 
server supposedly bases his report on the 
ratio of these two quantities; this ratio 
is called the likelihood-ratio. Some criti- 
cal value of the likelihood-ratio is estab- 
lished by the observer as his response 
criterion, depending upon his detection 
goal and other relevant situational parameters. 
The observer says “yes” (a 
signal was present) whenever the likeli- 
hood-ratio measured in a given observa- 
tion interval exceeds the response crite- 
rion, and “no” (only noise was present) 
whenever the likelihood-ratio is less 
than the criterion. If the observer fol- 
lows this rule, then the two independent 
measures, sensitivity and response crite- 
rion, can be derived from the fourfold 
stimulus-response matrix which results 
in a “yes-no” experiment. It has been 
found that the sensitivity measure re- 
mains relatively constant in vision 
(Swets, Tanner, & Birdsall, 1955) and 
in audition (Tanner et al., 1956) regardless 
of changes in the observer's 
attitude and changes in the experimental 
procedures. 
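
The derivation of the two measures from the fourfold matrix can be illustrated with a short computation. The Python sketch below again assumes equal-variance normal distributions, an assumption introduced for this example. The function name `yes_no_measures` and the cell frequencies are hypothetical. Sensitivity is taken as the difference between the normal deviates of the hit and false-alarm proportions, and the response criterion as the ratio of the two normal ordinates at the cut.

```python
from statistics import NormalDist


def yes_no_measures(hits, misses, false_alarms, correct_rejections):
    """Sensitivity and likelihood-ratio criterion from a fourfold matrix.

    Assumes equal-variance normal distributions (illustrative assumption).
    All four cells must be nonzero so that both proportions lie in (0, 1).
    """
    unit = NormalDist()
    hit_rate = hits / (hits + misses)
    fa_rate = false_alarms / (false_alarms + correct_rejections)
    z_hit = unit.inv_cdf(hit_rate)
    z_fa = unit.inv_cdf(fa_rate)
    sensitivity = z_hit - z_fa
    # Ratio of signal-plus-noise to noise-alone ordinates at the criterion.
    criterion = unit.pdf(z_hit) / unit.pdf(z_fa)
    return sensitivity, criterion


# Hypothetical observer under lenient and strict instructions.
lenient = yes_no_measures(hits=84, misses=16, false_alarms=50, correct_rejections=50)
strict = yes_no_measures(hits=50, misses=50, false_alarms=16, correct_rejections=84)
print(lenient, strict)
```

In this constructed example the change of instructions shifts the criterion from below 1 to above 1 while the sensitivity measure is unchanged, which is the kind of separation the theory is said to permit.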

The approach of signal detection the- 
ory appears to circumvent many of the 
problems which were mentioned earlier 
in this paper and which are inherent in 
a psychophysics developed along classical 
lines. The theory makes it possible 
to define a problem and to resolve it in 
a manner which does not rely on the 
threshold concept. In reviewing the 
experimental data from several types 
of psychophysical experiments, Swets 
(1961) concluded that: 


the existence of a sensory threshold has not 
been demonstrated (p. 176) [and that] there is 
now reason to believe that sensory excitation 
varies continuously . . . an apparent cut in 
the continuum results simply from restricting 
the observer to two categories of response 
(p. 169). 


This is essentially what Jastrow (1888) 
had contended. A similar view was 
taken by Urban (1930) who stated that: 


the threshold [is] a superfluous hypothesis, 
in itself not more than a [mere] metaphysical 
expression. . . . All the [psychophysical] 
formulae may be obtained without it . . . the 
notion of the threshold is not needed for the 
foundation of psychophysics (pp. 99-100). 


The theory of signal detection certainly 
seems to point in that direction, as does 
adaptation level theory. 


SUMMARY 


The major content of the present 
paper may be summarized briefly by in- 
dicating that: 

1. The conventional notion of threshold 
is inadequate for locating the beginning 
point of sensation on a postulated 
sensory continuum. 

2. The classical notion of threshold 
is sensory in nature and, as such, should 
be treated as an intervening variable; 
this notion of threshold is implicit in the 
sensory continuity-noncontinuity issue 
which finds its current expression in 
the unresolved quantal-phi gamma 
controversy. 

3. The tendency in contemporary psy- 
chophysics is to consider the threshold 
in terms of a response continuum; this 
permits an operational definition of the 
term on a probability basis and provides 
a dependent variable of performance 
which has applicability in a wide variety 
of experimental situations. 

4. The notions of sensory and response 
thresholds are separate and distinct; 
thus the issue of the existence or non- 
existence of sensory thresholds has no 
bearing on the status of response 
thresholds. 

5. The indications are, however, that 
the theory of signal detection may well 
supplant the classical threshold measure 
with its own measures of sensitivity and 
response criterion, thereby fulfilling 
Urban’s (1930) prophecy of a “psychophysics 
which does not start from the 
threshold hypothesis.” 

6. Consideration is given to adapta- 
tion level theory as an alternative 
approach which has the potential of 
bridging a wide range of psychological 
problems without recourse to the threshold 
concept. 


This paper reviews 31 empirical studies of small groups in which the 
major independent variable, group size, was related to several classes 
of dependent variables: group performance, distribution of participation, 
the nature of interaction, group organization, member performance, 
conformity and consensus, and member satisfaction. Many of these 
variables were found to be significantly affected by group size, but 
methodological shortcomings characterizing this group of studies pre- 
clude the assertion of broad generalizations. Several dependable and 
nondependable intervening variables are suggested which may help to 
account for many of the observed effects. Conclusions are: group size 
is an important variable which should be taken into account in any 
theory of group behavior, and future research on group size should 
proceed more systematically than in the past. 


This report is an effort to formulate 
generalizations about the effects of group 
size from a critical review of past re- 
search and an analysis of methods and 
problems relating to this subject. It is 
focused mainly on studies of face-to-face 
groups ranging in size from 2-20 mem- 
bers, in which behavior was studied di- 
rectly by observations, questionnaires, or 
interviews. Because of their relevance a 
few studies are included that depart in 
some respect from these criteria. While 
making no claim to comprehensiveness, 
we have included all studies that we 
could locate through 1960 meeting the 
above standards. In earlier reviews of 
research relating to small groups there 
were sections on group size (Kelley & 
Thibaut, 1954; Lorge, Fox, Davitz, & 
Brenner, 1958), but no thorough review 
has been written. Studies of the size of 
families, organizations, cities, and socie- 
ties are only generally relevant in this 
context and therefore have been excluded 
(see Caplow, 1957, for a review of stud- 
ies relating to organizational size). 


1 This article reports portions of a research 
project financed by a grant from the Horace 
H. Rackham School of Graduate Studies of 
the University of Michigan. 


The studies discussed here do not 
represent an integrated attack upon a 
single problem or set of problems. In- 
stead, they deal with a wide range of 
dependent variables and are concerned 
with establishing empirical relationships 
rather than with testing the implications 
of some general theory. To order the 
findings meaningfully, we have found 
it convenient to discuss them as specific 
topics under one of two major categories: 
(a) effects on the group as a whole, 
which include group performance, the 
distribution of participation, the nature 
of interaction, and group organization; 
and (b) effects on member behavior, 
including individual performance, con- 
formity and consensus, and member 
satisfaction. 


EFFECTS FOR THE GROUP AS A WHOLE 


Group Performance 

Ten experimental studies dealt with 
the effects of group size on group per- 
formance in problem solving or judg- 
mental tasks. These findings are sum- 
marized according to four main classes 
of dependent variables: quality of performance, 
speed, efficiency, and productivity. 

The quality of group problem solving 
was examined in four studies with mixed 
results. Taylor and Faust (1952) found 
that in playing the game Twenty Ques- 
tions, four-man groups correctly solved 
more problems than did two-man groups. 
Similarly, Fox, Lorge, Weltz, and Her- 
rold (1953) found that the quality of 
solutions to complex human-relations 
problems was significantly greater for 
groups of 12 and 13 than for groups 
of 6, 7, and 8. In contrast, Lorge and 
Solomon (1959, 1960) discovered no 
relationship between group size (2-5 
members in the first study and 3, 4, 6, 
and 7 members in the second) and 
the proportion of groups which arrived 
at the correct solution to the Tartaglia 
problem. 

The quality of group judgments based 
on collective decisions also showed mixed 
relationships to group size. South 
(1927) found no difference between 
three-person and six-person groups in 
their ability to judge emotional expres- 
sions from photographs, but he did find 
that the six-person groups were better 
(i.e., agreed more with expert opinion) 
at judging the quality of English compositions. 
Ziller (1957) presented 
groups of two to six Air Force officers 
with two types of task: he found a 
positive relationship between group size 
and the quality of the group’s judgment 
concerning the importance of certain 
facts for making military decisions; 
however, there was a curvilinear relationship 
between size and the accuracy 
of the group’s judgment of the number 
of dots on a card with four- and five-man 
groups doing less well than two-, three-, 
and six-man groups. Perlmutter’s study 
(1953) involved group memory rather 
than judgment: he found that three- 
person groups had somewhat better im- 
mediate recall for a story (The War of 
the Ghosts) than did two-person groups, 
but the difference was not statistically 
significant. 

Speed of group performance was ob- 
served in three of the above studies 
(Perlmutter, 1953; South, 1927; Taylor 
& Faust, 1952) as well as in the study 
by Kidd (1958). In each case, speed 
was measured in terms of the amount 
of time required for the group to com- 
plete the task. Group size did not 
influence speed in the case of four prob- 
lem solving tasks or the memory task. 
However, in the two judgmental tasks 
used by South, three-person groups were 
faster than six-person groups. South 
suggested that the judgmental tasks 
required the group to reach a compro- 
mise; to the extent that more discus- 
sion is needed in order to reconcile a 
wider variety of initial opinions, this 
would account for the fact that the 
larger groups took longer to reach a 
decision. 

The efficiency with which the group 
solves a problem was considered in connection 
with two tasks we have designated 
as “concept attainment” tasks. 
According to Taylor and Faust, two- 
person groups were more efficient than 
four-person groups since they expended 
fewer man minutes of “labor.” However, 
this is not really a new finding, since it 
is mathematically implied by the fact 
that there was no difference in speed as 
a function of size. Another meaning of 
problem solving efficiency involves the 
amount of intellectual effort expended, 
which in this case is measured by the 
number of questions asked by the group 
before they reached a solution. This 
measure of efficiency failed to differenti- 
ate between two- and four-person groups 
in the Taylor and Faust study, but 
South found that six-person groups were 
more efficient than three-person groups. 

Group productivity, defined as the 
number of correct units produced in a 
given time period, was examined in three 
experiments. Comparing groups of Sizes 
3-10, Watson (1928) found a correlation 
of .65 between size and the number of 
different correct words created in an 
anagrams task. With a similar task, un- 
scrambling sentences, Kidd found no dif- 
ferences in productivity between groups 
of two, four, and six. Gibb (1951) 
briefly reported a comparison between 
individuals and groups of Sizes 2, 3, 6, 
12, 24, 48, and 96 and reported that the 
number of suggested solutions to a 
complex problem increased as a negatively 
accelerated function of group size. 

Considering the group performance 
findings as a whole, it appears that both 
quality of performance and group pro- 
ductivity were positively correlated with 
group size under some conditions, and 
under no conditions were smaller groups 
superior. In contrast, measures of speed 
showed no difference or else favored the 
smaller groups. The heterogeneity of 
tasks and of measurement procedures 
prevents more precise generalizations. 


Distribution of Participation 


Four studies were focused on the rela- 
tive degree of participation of each 
member. In one of the earliest investi- 
gations in this area Dawe (1934) kept 
a record of the number of remarks made 
by each child in kindergarten classes 
ranging in size from 14 to 46. The 
total number of remarks decreased with 
increasing size, but not to a statistically 
significant degree. While an increase 
in the size of the group was accompanied 
by an increase in the total number of 
children who spoke (r=.82), there was 
a decrease in the proportion of the 
group who spoke (r=—.58). Dawe re- 
ported also that the members who were 
seated toward the front of the room 
tended to speak more often than those 
seated further back, indicating that a 
spatial factor may be one determinant 
of the relationship between size and 
participation. 


Using Bales’ categories, Bales, Strodt- 
beck, Mills, and Roseborough (1951) 
observed interaction in leaderless discus- 
sion groups ranging in size from three to 
eight members. As size increased there 
was an increase in the relative discrep- 
ancy between the percentage of partici- 
pation for the person ranked first and 
that for the person ranked second and 
a reduction in the difference between the 
percentage of participation for the per- 
son ranked second and for all those 
with less participation. The authors at- 
tempted to fit a harmonic function to 
these curves, but with no success. Later 
Stephan (1952) was able to fit an ex- 
ponential curve more successfully to the 
same data. 

Stephan and Mishler (1952) con- 
ducted an experiment to assess the gen- 
erality of their exponential model be- 
yond the type of group and method of 
gathering data used in the study by 
Bales et al. (1951). The unit of par- 
ticipation was all the verbal behavior 
exhibited by an individual between the 
time the previous speaker finished and 
the next began—a much larger unit 
than was used in the earlier study. In 
groups ranging in size from 4 to 12 mem- 
bers, they obtained results essentially 
the same as those of Bales et al. 
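
An exponential model of this kind can be sketched briefly. The Python fragment below assumes that the share of participation for the member of rank i takes the form a times r to the power (i - 1). The source describes the model only as exponential, so this parameterization, the function name `fit_exponential`, and the data are assumptions made for illustration. The fit is an ordinary least-squares line through the logarithms of the shares.

```python
import math


def fit_exponential(shares):
    """Fit share_i = a * r**(i - 1) to participation shares ordered by rank.

    Uses least squares on log(share) against (rank - 1); all shares must be
    positive. The model form is an assumption for this illustration.
    """
    xs = list(range(len(shares)))
    ys = [math.log(s) for s in shares]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), math.exp(slope)


# Hypothetical five-member group: each rank contributes 60% of the rank above.
shares = [40.0 * 0.6 ** k for k in range(5)]
a, r = fit_exponential(shares)
print(round(a, 3), round(r, 3))
```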

The relationship between group size 
and the distribution of acts was analyzed 
by Miller (1951) for groups of Sizes 
3-10, 12, 14, 16, 18, and 20. Although 
Miller used the same unit of participa- 
tion as Stephan and Mishler, the task 
was different: the game Twenty Ques- 
tions. As in Dawe’s study, it was found 
that the average number of participa- 
tions per member decreased as size in- 
creased (r=—.80). This of course is 
what would be expected when the length 
of the discussion period and the rate of 
participation are both held constant: 
as the group increases in size there is 
less opportunity for any individual to 
speak. However, this reduction in opportunity 
to speak did not seem to be 
accompanied by decreasing participation 
of the members not ranked highest, for 
there was found a small, nonsignificant 
negative correlation between group size 
and the number of persons who deviated 
from their expected percentage of par- 
ticipation based on equal distribution. 
Thus Miller's study casts doubt on the 
generality of the findings of Bales et al. 
and Stephan and Mishler and of the 
models developed to depict the results. 


Nature of Interaction 


Four additional studies provided in- 
formation about the qualitative char- 
acteristics of group interaction. To 
study the "interaction profile" of the 
acts in the 12 categories of Bales' 
scheme, Bales and Borgatta (1955) ob- 
served discussion groups ranging in size 
from two to seven. The raw number of 
acts in each category was made a per- 
centage of the total number of acts in 
the 12 categories and then converted by 
an arc-sine transformation. Analysis 
indicated that as size increased, there 
was an increase in the categories of 
showing tension release and giving sug- 
gestions; and a decrease in categories of 
showing tension, showing agreement, 
and asking for opinion. In addition, 
two-person groups appeared to have 
certain unique properties, namely, a high 
rate in the category of showing tension 
coupled with low rates in the cate- 
gories of showing disagreement and 
showing antagonism. An odd-even ef- 
fect was also apparent: groups of 
four and six showed higher rates than 
did groups of three, five, and seven in 
the categories of showing disagreement 
and showing antagonism, but lower rates 
in the category of asking for sugges- 
tions. These findings must be inter- 
preted as representing relative increases 
or decreases in an interaction category. 
Bales and Borgatta offer intriguing 
speculations to interpret the trends, but 
it is difficult to draw conclusions about 
what the critical changes may be as size 
increases from two to seven because a 
relative increase in one category may 
come about due to either an absolute 
increase for the category or an absolute 
decrease for all other categories, or both. 
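
The difficulty with relative measures can be shown with a constructed example. In the Python sketch below the category counts are hypothetical, and the arc-sine transformation is taken in the common form 2 arcsin(sqrt(p)); the exact form used by Bales and Borgatta is not given in the text and is assumed here.

```python
import math


def relative_profile(counts):
    """Express each category's raw count as a proportion of all acts."""
    total = sum(counts.values())
    return {category: n / total for category, n in counts.items()}


def arcsine_transform(p):
    """Arc-sine transformation of a proportion (assumed form)."""
    return 2.0 * math.asin(math.sqrt(p))


# Hypothetical counts: suggestions are constant, agreement acts are halved.
smaller = {"gives suggestion": 20, "shows agreement": 40, "all other": 40}
larger = {"gives suggestion": 20, "shows agreement": 20, "all other": 40}

for label, counts in (("smaller", smaller), ("larger", larger)):
    share = relative_profile(counts)["gives suggestion"]
    print(label, round(share, 3), round(arcsine_transform(share), 3))
```

Here the relative rate of giving suggestions rises although the absolute number of suggestions is identical in the two groups; the rise is produced entirely by the drop in another category.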

Bales and Borgatta also analyzed 
variability over four sessions for indi- 
viduals and groups. They found in- 
creased variability for each individual's 
performance as size increased. The 
authors claimed that scores may have 
been less reliable in the larger groups because 
of partitioning scores among a 
greater number of persons. Another in- 
terpretation given by these researchers 
is that there may have been more shift- 
ing of roles in larger groups because 
there were more persons to perform role 
functions. Trends of variability among 
individuals, of given groups over successive 
sessions, and among groups revealed 
no clear-cut size effects. 

Slater (1958) used Bales’ categories 
with groups ranging from two to seven 
and concluded that there were inhibiting 
forces in the smallest groups which pre- 
vented the expression of dissatisfaction 
and disagreement. Scores on an Inhibi- 
tion Index (the ratio of the number of 
acts in four agreement categories to the 
number of acts in five disagreement 
categories) were significantly higher in 
groups of two, three, and four than in 
groups of five, six, and seven. Slater’s 
explanation was that the consequences 
of alienating a single member may have 
been more severe in the smaller than in 
the larger groups. 

A study by Berkowitz (1958) of 
groups of 3, 4, 6, 7, 9, and 10 members 
revealed that there was more disagree- 
ment in solving logical problems in the 
larger groups than in the smaller groups 
—a finding consistent with those of 
Slater and Bales and Borgatta. 

Bass and Norton (1951) observed 
leaderless discussion groups of 2, 4, 6, 
8, and 12 members in which each member 
was rated on nine aspects of leadership 
behavior. As group size increased, 
the average leadership rating assigned to 
group members decreased significantly 
— group size accounting for 83% of the 
variance. There was also a nonsignifi- 
cant tendency for the within-group vari- 
ance of ratings to increase with size. The 
authors concluded that opportunity to 
adopt leadership functions decreased directly 
with an increase in group size. 

Tentatively, it would appear that 
smaller groups inhibit expression of dis- 
agreements and dissatisfactions more 
than larger groups and give each indi- 
vidual more opportunity to interact and 
to exhibit leadership behavior. The pos- 
sibility that there are unique, odd-even 
and near-linear size effects on the nature 
of interaction, as Bales and Borgatta 
suggest, should be pursued further, pos- 
sibly using different methods from the 
ones employed by these researchers. 


Group Organization 


The last set of studies related to ef- 
fects on the group as a whole concerns 
group organization. In Berkowitz’s study 
of the social organization of problem 
solving groups, 20 variables were found 
to be significantly affected by group 
size and most of these were reduced by 
a cluster analysis to two relatively inde- 
pendent sets of variables. The first 
cluster was called group cohesion, con- 
sisting mainly of sociometric variables 
such as the number of friendship choices. 
The components of this cluster were 
negatively related to size. The second 
cluster consisted of variables reflecting 
organization and division of labor. The 
variables in this cluster and their rela- 
tionship to size follow: as size increased, 
there was decreasing contribution of the 
least active member, higher member 
variation of interaction, higher variation 
in the number of rules for solving the 
problem suggested by members and subsequently 
adopted by the group, increasing 
number of leaders, higher variation 
of rules suggested by each man, increas- 
ing number of votes required to reach 
decisions, increasing suggestions con- 
veyed by those who did not originally 
make the suggestion, and higher speciali- 
zation in the use of rules to solve the 
problems. 

In a study of factors affecting con- 
sensus in decision making groups, Hare 
(1952) found a greater tendency for 
groups of 12 than for groups of 5 to 
break into factions or cliques. This trend 
was not significant, however, and was 
based on the reports of group members. 
Probably more reliable are the findings 
of Miller who observed directly the 
frequency with which two or more mem- 
bers of a group talked or whispered 
among themselves rather than to the 
group as a whole. A significant correla- 
tion of .77 was found between the num- 
ber of members (3-10, 12, 14, 16, 18, 
and 20) and the number of cliques. Also, 
Miller found that this increase in cliques 
was associated with a decrease in group 
cohesiveness; a Cohesiveness Index cor- 
related —.52 with group size and —.60 
with the number of cliques. 

After familiarizing subjects with the 
concept of the primary group, Fisher 
(1953) asked them to describe the pri- 
mary groups to which they belonged and 
rank them for intimacy. Within the size 
range of 2-12 members, smaller groups 
were ranked as significantly more inti- 
mate. 

Taken together, these studies indicate 
that as size increases there will be de- 
creasing group cohesiveness and increas- 
ing organization and division of labor 
in the group, along with the development 
of cliques and possibly of factions. 


EFFECTS ON MEMBER BEHAVIOR 


Individual Performance 


The effect of group size on individual 
performance has been considered primarily 
in connection with practical problems. 
In education, for example, there is 
a long history of concern with class size 
as a possible influence on classroom 
learning. We will not attempt to review 
this literature here, since it has been 
covered by Hudelson (1928), Von 
Borgersrode (1941), Goodlad (1960), 
and McKeachie (1963). Hudelson re- 
ported on 59 controlled experiments of 
the effects of class size, 46 of which 
favored large over small classes. In Von 
Borgersrode’s review 73 studies were ap- 
praised, 19 of which were classified as 
semicontrolled and 24 as controlled; his 
conclusion was that “On the whole the 
statistical findings definitely favor the 
large classes at every level of instruction 
except kindergarten” (p. 199). In his 
review of the effects of lecture size at 
the college level, McKeachie noted that 
more recent studies have not been as 
complimentary to large lectures as were 
the earlier studies done during the 1920s. 
Nevertheless, he concluded that large 
lectures were not generally inferior. 
Lorge et al. (1958) made reference in 
their review of together-alone studies to 
some of the literature on classroom learn- 
ing and concluded that there was prob- 
ably an interaction between class size, 
teaching methods, and study methods as 
determinants of educational effective- 
ness. But even so, the consensus of the 
reviewers was that large classes are either 
superior to small classes or at least not 
inferior. 

A second area of practical concern is 
the productivity of individual workers 
as a function of the size of work groups. 
Marriot (1949) found that the amount 
produced by male workers in a British 
factory, as measured by average piece- 
work earnings per man, declined signifi- 
cantly as the size of the work group in- 
creased from 10 to 50 members. In 
contrast, Gekoski (1952) found a non- 
significant positive relation between in- 
dividual productivity and size of work 
group (4-19 members) among female 
clerical workers in an American insurance 
company. Because these studies 
differed simultaneously along several 
dimensions, we have no basis for spec- 
ulation as to the specific conditions 
which determine whether group size will 
be positively or negatively correlated 
with individual productivity. 

Two experiments dealt with the indi- 
vidual’s improvement in problem solving 
as a result of his interacting in groups of 
varying size. Taylor and Faust (1952) 
found that practice in playing Twenty 
Questions enhanced the individual’s abil- 
ity to solve such problems, but it did 
not matter whether the practice was ob- 
tained alone, as a member of a two- 
person group, or as a member of a four- 
person group. By contrast, positive re- 
sults were reported by Utterback and 
Fotheringham (1958) in regard to the 
quality of solutions to human relations 
problems. Individual answers were recorded
both before and after a discussion
in groups of 3, 6, 9, or 12 members. 
Improvement in quality of the indi- 
vidual’s solutions was significantly 
greater for the larger groups. However, 
there was also a significant interaction 
between group size and the manner in 
which the discussion was led: when the 
moderator intervened a great deal (“full
moderation”), individual improvement
was greatest for 12-person groups; but
when the moderator intervened very
little (“partial moderation”), individual
improvement was greatest for 3-person 
groups. Thus group size sometimes is 
related to individual problem solving, 
but the direction of the relationship is 
highly dependent on group conditions 
other than size.


Conformity and Consensus 


In his famous experiment on yielding 
to a group of peers who, unknown to the 
naive subject, had been instructed to 
report unanimous but incorrect judg- 
ments of the length of visually presented 
lines, Asch (1953) manipulated the size
of the unanimous opposition. He found 
that as the number of confederates in- 
creased from 1 to 3, the amount of 
yielding to their unanimous judgment 
increased significantly; but there was no 
further increase in conformity as the 
confederates increased in number to 4, 
8, or 16. In two related studies, Gold- 
berg (1954) and Kidd (1958) failed 
to find any effect of group size even 
though their manipulations of group 
pressure did succeed in producing a sig- 
nificant amount of conformity. Gold- 
berg’s subjects, in groups of 2 or 4, made 
individual judgments of the intelligence 
of persons from their photographs; while 
Kidd's subjects, in groups of 2, 4, or 6,
made individual judgments of flicker 
frequency. In both experiments, each 
individual made a second judgment after 
being exposed to false feedback concern- 
ing the group's average first judgment. 
Conformity, as measured by shifting 
toward the bogus group average, occurred
in all groups, but the amount of
yielding was not related to group size in 
either experiment. In contrast, Kishida 
(1956) found a significant size effect 
using groups of 5, 10, and 30 Japanese 
university students. Subjects responded 
individually to an opinion questionnaire, 
then received true feedback as to the 
majority opinion, and finally responded 
a second time to the questionnaire. Al- 
though there was a shift toward conform- 
ity in all groups, magnitude of opinion 
change showed a curvilinear relationship 
to group size, being greatest in 10-person 
groups and least in 5-person groups. 
The results of these four studies indicate 
that the magnitude of the group’s in- 
fluence on the individual is a function 
of group size under some conditions, 
but differences in task and procedure 
again preclude specification of the rele- 
vant conditions. 

A second set of findings involved 
measurement of the individual's opinion 
both before and after group discussion
of a problem. Half of Kishida’s (1956) 
groups discussed the opinion items and 
arrived at a group decision. Analysis 
of the individual postdiscussion re- 
sponses indicated that shift toward the 
group opinion bore the same curvilinear 
relationship to group size under discus- 
sion conditions as in the condition in- 
volving feedback of the majority opinion. 
A negative effect of group size was found 
by Hare (1952) with groups of 5 and 
12 Boy Scouts, with two measures show- 
ing that consensus increased more in the 
smaller groups. Finally, Utterback and 
Fotheringham (1958) reported a sig- 
nificant interaction between group size 
and the amount of intervention by the 
discussion moderator. Increase in con- 
sensus was greatest in 12-person groups 
under full moderation, but greatest in 
3-person groups under partial modera- 
tion. However, there was no main effect 
of group size in this study. These find- 
ings lend further weight to the conclu- 
sion that group size is an important 
factor in determining the amount of 
yielding to conformity pressures. 


Member Satisfaction 


Questionnaire measures of members’ 
subjective reactions to the group were 
included in four of the studies cited 
above. Both Hare (1952) and Slater 
(1958) found that members of larger 
groups were significantly less satisfied 
with the amount of time available for 
discussion, with their opportunity to 
participate, and with the group meeting 
or its decision. In addition, Slater found 
that his subjects considered five members
to be optimum (i.e., neither too
large nor too small) for the task of dis- 
cussing a human relations problem. In 
contrast to the others, Ziller (1957)
found no clear relationship between 
group size and members' satisfaction 
with their own part in the discussion, 
and Miller (1951) found no relationship 
between group size and three measures
of satisfaction. With this exception, the 
general trend of the findings indicates 
that the smaller the group, the more 
likely it is that the individual will be 
satisfied with the discussion and with 
his own part in it. McKeachie's (1963) 
review of studies on discussion groups at 
the college level also indicated that larger 
groups were less satisfying to both 
students and instructors. It must be re- 
membered, however, that the studies re- 
ferred to here all dealt with discussion 
groups attempting to solve particular 
problems, and that the generalization 
may not apply to other types of groups. 


EVALUATION 


More converging findings emerged 
from this review than one might have 
expected considering that only 31 
empirical investigations of group size 
were found to be relevant. But many 
more studies will have to be conducted 
and appraised before general conclusions 
can be drawn with confidence about 
the numerous effects of group size. Aside 
from their relatively small number, this 
set of studies, when considered collec- 
tively, has other limitations as a basis 
for generalized inferences. These short- 
comings and problems are discussed be- 
low. 


Methodological Difficulties 


The arbitrary and unsystematic selec- 
tion of sizes for comparison is perhaps 
one of the most serious methodological 
shortcomings of this group of studies.
The pertinent range of sizes in order to 
generalize safely about small groups 
would seem to be 2 through at least 20. 
Two biases in drawing samples from this 
range are common. One is to sample a 
truncated series of the range, such as 
Sizes 2-4, and the other is to draw 
samples omitting various adjacent sizes 
so that there is an overrepresentation of 
odd or even sizes. Such biases may obscure
the true functional relationship between
size and the dependent variable.
There is no reason to expect only a 
single type of function, such as a linear 
relationship, to hold between size and 
any dependent variable: in the studies 
reviewed here it was not uncommon to 
find a curvilinear relationship (cf. Asch, 
1953; Gibb, 1951; Kishida, 1956; Ziller, 
1957). From information about a limited 
portion of the size range, it is of course 
impossible to extrapolate with confidence 
to the type of functional relationship 
characterizing the entire range. As Bor- 
gatta and Cottrell (1957) have noted, 
the effects of size may be quantitative 
or discrete. The apparent linear effects,
the odd-even differences, and the unique- 
ness of Size 2 would not have been noted 
by Bales and Borgatta had they sampled 
a more limited series of sizes. 

A second limitation of this group of 
studies is that many independent varia- 
bles other than group size were involved. 
The most conspicuous of these were the 
population characteristics of the sub- 
jects. In many cases, the observed groups 
were composed of either all male or all 
female members; in other cases, the 
groups were mixed; and a few investiga- 
tors failed to report the sex of their 
subjects. The age and education of 
subjects were other population charac- 
teristics with respect to which the studies 
differed. The tasks employed were 
similarly heterogeneous, and like the 
population characteristics, could have 
interacted with group size to produce the 
obtained results. In general, these studies
provide very little information concern- 
ing the conditions under which group 
size is related in some specific way to 
some particular variable. 

Another shortcoming of this group of 
studies is the failure of the majority of 
the investigators to seek to determine 
why changes in group size had the ob- 
served effects. Rather than focusing on 
causal analysis, most of the researchers 
have used what might be called a “correlational”
approach, which generally begins
and ends with an empirical concern
with the relationship between group size 
and a given dependent variable. It is as 
if many of the studies proceeded on the 
assumption that size should somehow 
have an immediate effect upon some 
aspect of behavior, and, as a conse- 
quence, relevant intervening variables 
were almost never measured or varied 
experimentally. 

A basically different approach to the 
study of group size follows from the view 
that the size variable will have no be- 
havioral effects when stripped of various 
social and psychological accompaniments.
It is apparent, for example, that
if there is no interdependence or com- 
munication between members, the size 
of the group is irrelevant to the predic- 
tion of behavioral changes for the mem- 
bers. An analysis and design oriented 
toward isolating the critical intervening 
variables in the relationship between size 
and the dependent variables should be 
the objective of studies which aspire to 
understand why size has effects. 

Indik's (1961) study of the relationship
between the size of voluntary organizations
and the tendency of members
to participate clearly illustrates the
importance of establishing a procedure 
to search for, and examine, the effects of 
possible intervening variables. Although 
voluntary organizations ranging in mem- 
bership from 15 to 2,983 were studied, 
the method illustrated is fully applicable 
to the small group. It was hypothesized 
that the negative relationship between 
size of organization and tendency of 
members to participate (as indicated by 
absenteeism in a service company and 
attendance in a women's association) is 
mediated by organizational variables 
(e.g., amount of communication, job and 
task specialization, higher level interpersonal
control and coordination) and
psychological variables (e.g., attraction
of members to the organization, satisfac- 
tion with one's activities performed in 
the organization, and perceived bureau- 
cratic inflexibility). Four explanatory 
schemas were proposed in each of which 
size was linked to an organizational and 
a psychological intervening variable. The 
schema which found the strongest sup- 
port was the one which hypothesized that 
the size of the organization would be 
negatively related to the amount of com- 
munication among members, that the 
amount of communication of each mem- 
ber would be positively associated with 
the attraction of members to the or- 
ganization, and that the attraction of the 
members to the organization would be 
positively associated with tendency to 
participate. Analyses were also made of 
the other schemas and from these it was 
concluded that various intervening varia- 
bles mediated the relationship between 
size and participation. This approach to 
the problem is distinctly more sophisti- 
cated than that of relating size to a 
given dependent variable without con- 
sidering possible intervening conditions. 
However, Indik did not let the problem 
rest with the results reported above, for 
he ran higher order partial correlations, 
holding constant pairs of intervening 
variables, to see what effects this would 
have on the magnitude of the original 
correlation of —.53 between size and 
tendency to participate. When the com- 
bined effects of the specialization-satis- 
faction linkage were removed the cor- 
relation between size of organization and 
tendency to participate dropped from 
—.53 to —.26; when the effects of lack
of coordination and perceived bureau- 
cratic inflexibility were removed, the 
correlation dropped further to —47; 
and finally, when the communication- 
attraction linkage was removed, the par- 
tial correlation was left at —.08. 
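
The partialling procedure described above can be sketched in a few lines of Python. This is only an illustration of how higher order partial correlations are built up from zero-order correlations by the standard recursive formula; the variable names and the correlation values in the usage note are hypothetical and are not Indik's data.

```python
from math import sqrt


def partial_corr(r, x, y, controls=()):
    """Correlation of x and y with every variable in `controls` held constant.

    `r` maps frozenset({a, b}) to the zero-order correlation of a and b.
    The recursion removes one control variable at a time, so two controls
    give a second-order partial, three a third-order partial, and so on.
    """
    if not controls:
        return r[frozenset((x, y))]
    z, rest = controls[0], tuple(controls[1:])
    r_xy = partial_corr(r, x, y, rest)
    r_xz = partial_corr(r, x, z, rest)
    r_yz = partial_corr(r, y, z, rest)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
```

With hypothetical zero-order values of -.50 between size and participation, -.60 between size and communication, and .60 between communication and participation, `partial_corr(r, "size", "participation", ("communication",))` gives about -.22. A drop of this kind in the original correlation is what the mediation argument looks for.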


Dependable Intervening Variables


If group size is phenotypic and really 
but a correlate of the social and psycho- 
logical conditions capable of producing 
changes in member and group behavior, 
it would be fruitful to stipulate what 
these intervening variables are. Can we 
say that some intervening variables will 
always or most always be influenced by 
variations in group size? If there are 
such variables and if these variables also 
produce relevant outcomes, then hy- 
potheses predicating such outcomes as 
a function of size should be more or less 
axiomatically correct. Various classes of
such variables are discussed below, along 
with examples and probable effects. 

Input Quantity. Two types of input 
may be assumed to increase with group 
size. The first is resource input, such as 
interaction skills, knowledge, capacities, 
and physical strength. If we designate
the amount of any given resource
brought to the group by an individual
as $r$, and the total amount of the resources
of all group members made use
of in the group as $R$, then $R = f(r, n)$,
where $n$ equals group size. For some resources
each individual will contribute
approximately equal amounts, in which
case $R = r \times n$; for other resources the
amounts will vary widely among individuals
and the functional relationship
between $r$ and $n$ will be more complex.
Whether $R$ is greater than, less than, or
equal to the sum of the individual resources
($\sum r_i$) depends upon facilitating
and hindering group conditions. But
in general, $R$ increases with increasing
group size, whether or not the relationship
is linear.

The relevance of viewing resource in- 
put as a dependable intervening variable 
is clear in the case of group problem 
solving. If knowledge is needed to solve 
a problem and each individual brings 
particular knowledge that others do not
have, with increasing size there will be
greater likelihood that the total amount 
of knowledge pertinent for solving the 
problem will be available in the group. 
If intelligence is the resource required 
for solving the problem, then the addi- 
tion of each individual increases the 
likelihood that someone will have the 
capacity to solve the problem. Assuming 
that problem difficulty is known, the 
application of elementary probability 
theory to this problem readily makes 
possible the writing of formulas to ex- 
press the likelihood of intellectual ca- 
pacities being brought to the group with 
the addition of each individual. Thus 
one may state the likelihood that there 
will be at least one member in the 
group with the correct answer, that there 
will be more than half with the correct 
answer, that everyone will be correct, 
that everyone will be incorrect, or that 
there will be mixed correct and incorrect 
answers. If $p$ equals the probability of
the success of the individual working
alone on the problem, $n$ is the number
of individuals in the group, and $q$ equals
$1 - p$, then the probability that there will
be at least one person in the group arriving
at the right answer equals
$1 - (1 - p)^n$. Taylor (1954), Ekman (1955),
and Lorge and Solomon (1955) have all 
applied this basic idea to group problem 
solving where it is useful to compare 
results against theoretical probabilities 
based on the assumption that if one 
member gets the correct answer the 
group will accept it. 
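
As a minimal sketch of the probability argument above, the following Python functions compute the chance that at least one member, more than half of the members, every member, or no member of an n-person group has the correct answer. The sketch assumes, as the text does, a one-stage problem, a common individual success probability p, and members who work independently. The function names are illustrative and are not taken from the cited papers.

```python
from math import comb


def p_exactly_k(p: float, n: int, k: int) -> float:
    """Binomial probability that exactly k of n independent members are correct."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


def p_at_least_one(p: float, n: int) -> float:
    """1 - (1 - p)^n: at least one member has the correct answer."""
    return 1 - (1 - p) ** n


def p_majority(p: float, n: int) -> float:
    """More than half of the n members have the correct answer."""
    return sum(p_exactly_k(p, n, k) for k in range(n // 2 + 1, n + 1))


def p_all_correct(p: float, n: int) -> float:
    """Every member has the correct answer."""
    return p ** n


def p_all_incorrect(p: float, n: int) -> float:
    """No member has the correct answer."""
    return (1 - p) ** n
```

For example, with p = .5 and n = 3, `p_at_least_one(0.5, 3)` is .875, whereas `p_majority(0.5, 3)` is only .5. Observed group success rates can then be compared against whichever of these baselines matches the assumed decision rule.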

The formula given above applies to 
problems having one stage in their solu- 
tion, but may be extended to multistage 
problems, as Lorge and Solomon have 
done. Their formula for the probability 
of a group solving a multistage problem 
is the product of the probabilities that 
the group can solve the problem for each 
stage. Steiner and Rajaratnam (1961)
have further extended the basic idea by 
developing formulas, applicable to interval
data, which make it possible to
test the null hypothesis that groups func- 
tion at the level of the most competent 
member, of the least competent member, 
or at any level of competence. In con- 
sidering group solutions that can be 
represented as distributions of individual 
answers, Thomas and Fink (1961) have 
elaborated three models (independence, 
rational, and consensus) which state 
varying antecedent group conditions and 
the associated theoretical probabilities of 
having everyone correct, some correct 
and incorrect, and everyone incorrect for 
problems involving two-alternative and 
multialternative solutions. 
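
The multistage extension described at the start of this paragraph can be sketched as a product over stages. The sketch below assumes that stages are independent, that a stage is solved whenever at least one member solves it, and that all members share the same success probability at a given stage. The per-stage probabilities in the usage note are hypothetical, and the code is an illustration of the product rule as stated in the text, not a reproduction of the published formulas.

```python
from math import prod


def p_group_solves_stage(p_stage: float, n: int) -> float:
    """Probability that at least one of n independent members solves one stage."""
    return 1 - (1 - p_stage) ** n


def p_group_solves_all_stages(stage_probs, n: int) -> float:
    """Product over stages of the group's probability of solving each stage."""
    return prod(p_group_solves_stage(p, n) for p in stage_probs)
```

With two stages that an individual solves with probability .5 each, a two-person group solves each stage with probability .75 and the whole problem with probability .5625, compared with .25 for an individual working alone.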

A second type of input that increases 
with size is demand input. The demands 
each individual brings to the group in- 
clude the need for recognition, affection, 
a minimum amount of social interaction, 
and so on. As size increases the sum of 
these individual demands increases. One 
of the main implications of an increase 
in demand input with increasing size is 
that larger groups will need more re- 
sources than smaller groups in order to 
meet the demand. Furthermore, we may 
assume that to the extent that larger 
groups are unable to satisfy demands of 
the members there will be decreased
attractiveness, lack of reward, resulting 
in dissatisfaction, lowered morale, and 
a tendency to leave the group. If we 
assume that each person has some desire 
to communicate in group discussions, 
then restricting the amount of time to 
discuss a problem, as Hare (1952) did, 
will result in greater demand on time in 
the larger groups. Hare’s finding that
there was decreased satisfaction with 
the discussion in the larger groups as 
compared with the smaller groups is
understandable in terms of the above 
comments. 

Increasing Sample Size. As the size of 
the group increases, there is obviously 
an increase in the size of the sample of 
individuals. One consequence of increasing
sample size is that there is more
accurate estimation of parameters of the 
parent population when the sample is 
taken from a nonhomogeneous universe 
and the individuals are not added to the 
group by a biased method. Parameter 
estimation is illustrated by the results 
of a study by Eysenck (1939) in which 
the rankings of a sample of 200 subjects 
concerning their preferences for 12 black 
and white pictures were compared with 
the criterion of the average rankings by 
700 subjects. It was found that as the 
number of unbiased judges increased, 
the correlation of their pooled rankings 
with a criterion increased. Sampling and 
test theory abound with other illustra- 
tions. In Ziller’s study, judgments of 
the number of dots on a card should have 
been more accurate with increasing 
group size in the same way that he found 
increasing correspondence between the 
judgment of military experts on human 
relations problems with the judgments 
of the group members as size increased. 
One can only conclude that the errors 
in judgment were not random in the 
problem of judging dot numerosity, or 
that biased samples of judges were 
drawn, or both. 

A second consequence of increasing 
sample size is that the heterogeneity of 
the variates sampled increases due to the 
inclusion of more variates from the extremes
of the distribution. Applied to
social groups, it follows that with an in- 
crease in size there will be more varied 
talents, more individuals with requisite 
skills and knowledge for performing 
specialized tasks, and more individuals 
who are likely to be liabilities. The 
foregoing review revealed that as size 
increased there was increasing variability 
of individual performance and increas- 
ing organization and division of labor in 
the group; both effects are probably due 
in part to an increase in the heterogeneity
of the sample.

Potential Relational Complexity. Various
writers, among them Bossard (1944)
and Kephart (1950), have pointed out 
that as group size increases arithmeti- 
cally there is a geometrical increase in 
the number of relational possibilities 
among the members. For example, the 
number of possible diadic relations in 
any group increases according to the 
familiar formula: $(n^2 - n)/2$. Potential
relational complexity becomes important 
in groups because most individuals have 
a limited capacity to establish relation- 
ships with others. For example, Jennings 
(1960) has indicated that the average 
repertoire of choices for associating 
closely with others (working, etc.) is 
about 8 and that this number is increased 
to about 12 when leisure time activities 
are included. Individual relational ca- 
pacities limit the varieties of informal 
and formal organization and the degree 
of role differentiation that may develop. 
Thus an upper bound is automatically 
set on the actual relational complexity 
of a group by the limited capacity of 
individuals to establish particular rela- 
tionships and by the potential relational 
possibilities.

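
The growth of potential dyadic relations, and the ceiling that a limited individual capacity places on how many of them can be realized, can be illustrated with a short Python sketch. The capacity model is an assumption made for illustration only: each member is taken to sustain at most a fixed number of mutual relations, and the figure of 8 echoes the repertoire size attributed to Jennings above. The function names are hypothetical.

```python
def potential_dyads(n: int) -> int:
    """Number of possible pairs in a group of n members: (n^2 - n) / 2."""
    return (n * n - n) // 2


def max_filled_proportion(n: int, capacity: int) -> float:
    """Upper bound on the share of possible dyads that can actually be realized
    when each member can sustain at most `capacity` mutual relations."""
    if n < 2:
        return 0.0
    # Each member has n - 1 potential partners but can fill at most `capacity`.
    return min(1.0, capacity / (n - 1))


for n in (2, 3, 5, 9, 12, 20):
    print(n, potential_dyads(n), round(max_filled_proportion(n, capacity=8), 2))
```

Under these assumptions every possible dyad can be filled in groups of up to 9 members, after which the attainable proportion falls steadily as size increases.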
Several studies reviewed here indicate 
that as group size increases there is a 
decrease in cohesiveness, along with the 
development of cliques and possibly of 
factions. These findings are understand- 
able generally when the limited choice 
capacities of individuals are placed 
against the potential relations that 
multiply with group size. First consider 
cohesiveness, for which we may assume 
that positive attraction among all mem- 
bers at some degree of intensity is gen- 
erally required. If we acknowledge that 
individuals have a limited capacity to 
like others (as the findings of Jennings, 
1960, suggest), and that the sociometric 
structure of a group must be filled up 
with a certain relatively high number of 
liking linkages in order for cohesiveness 
to exist, then as group size increases the 
potential relations will exceed the capacity
of individuals to fill them, the
average member attraction in the group 
will decrease, and consequently so will 
group cohesiveness. The development of 
cliques and factions should eventually 
occur, for as size increases a smaller pro- 
portion of the possible linkages will be 
made. 

Differences between diads and triads 
can be understood partially in terms of 
relational possibilities. As Bales and 
Borgatta (1955) have noted, in a two- 
person group there can be no majority 
other than by unanimity, whereas in a 
three-person group there can be a major- 
ity of two against a minority of one. 
In a three-person group there are eight
coalition possibilities (Caplow,
1959) whereas in a diad the power rela- 
tions are much simpler: one member may 
have power over the other or they may 
be equal. 


Nondependable Intervening Variables 


Nondependable intervening variables 
are those which are affected by changes 
in group size only under certain condi- 
tions. Almost any social or psychological 
condition influencing group process or 
outcome may be a nondependable inter- 
vening variable. There are numerous 
examples: the time allowed for group 
discussion may be held constant while 
group size is increased, thereby reducing 
the opportunity to participate; a group 
payoff may be held constant in coopera- 
tive groups of different sizes, thus vary- 
ing the amount of reward that each 
member may possibly get when the pay- 
off is divided among them; expectancy 
of reward may decrease with increasing 
size of a competitive group; the relative 
contribution of each member in coopera- 
tive groups may decrease with size, 
thereby decreasing the individual’s sense 
of importance and worth in the group; 
the interdependence of the members may 
be high, thus increasing the likelihood 
that with increasing size there will be at 
least one individual who will perform
poorly and hinder the group's progress; 
the cost in time and money represented 
by each additional group member may 
eventually exceed the possible gain to be 
derived from having more individuals 
work on the problem; and the complex- 
ity of the cognitive field may become 
great with increasing size to the extent 
that members try to attend to impinging 
social stimuli. 

Most of the observed effects of size 
reviewed here appear to be contingent 
upon the operation of one or more of 
these nondependable intervening varia- 
bles. Where such mediating variables 
are affecting an outcome, the proper 
focus for explanation should be on the 
theoretical framework of which the inter- 
vening variable is a part and only 
secondarily on size. 


CONCLUSIONS 


On the basis of this review it is ap- 
parent that group size has significant 
effects on aspects of individual and 
group performance, on the nature of 
interaction and distribution of participa- 
tion of group members, on group or- 
ganization, on conformity and consensus, 
and on member satisfaction. This ap- 
praisal suggests that the variable of 
group size should be included in theories 
of group behavior, distinguishing where 
possible between the effects that result 
from the interaction of group size with 
other independent variables and the 
effects arising from intervening variables 
that are dependably and nondependably 
associated with size. 

It is concluded furthermore that fu- 
ture research on group size should pro- 
ceed systematically, making every effort 
(a) to vary size in complete sequence 
over a suitably large range; (b) to 
conceptualize, identify, and measure 
relevant intervening variables; (c) to 
determine in advance whether these 
variables should be expected axiomatically
to be correlated with size or would
be only contingent variables; and (d) to 
use multivariate designs, where appro- 
priate, in which group size and signifi- 
cant intervening variables are both 
manipulated experimentally. 


ON HOMOGENEOUS RETINAL STIMULATION AND
THE PERCEPTION OF DEPTH 


THOMAS NATSOULAS 1
Wesleyan University 


A review of recent experiments showing that where the perception of 
voluminous fog does not occur consistently under conditions attempting 
homogeneity of visual stimulation, there are sources of inhomogeneity 
which can produce the impression of a surface. As homogeneity is 
approached, the volume experience becomes more reliable. A view of 
this phenomenon, other than Gibson’s—which does not deal with it on 
the grounds of poor reliability—or Koffka's—which attributes it to the 
fundamental nature of the perceptual system, is presented. It is based 
on kinesthetic stimulation as a likely source of visual space anisotropy
with respect to perceived distance.


Recently Gibson (1959) argued persuasively
for a theory of “perception as
a function of stimulation.” A key assertion
in the theory was that “The stimulus
variables for vision must exclusively
be found in a textured optical array, sup- 
plemented by the transformation relat- 
ing a simultaneous pair of them, and by 
the transformations relating a continuous
sequence of momentary arrays” (p.
474). As applied to the visual perception 
of depth, the theory requires that such 
experiences be produced by certain at- 
tributes of the textured optical array, 
especially variations in “density, dispar- 
ity, and motility” (p. 475). It seems 
reasonable to conclude from the theory 
that the perception of visual depth 
should not occur under conditions of 
homogeneous retinal stimulation. 

Yet there is evidence, discussed below, 
that visual depth is experienced when 
visual stimulation is not patterned. This 
fact leaves the theorist two alternatives: 
to state the theory in a way more con- 
sistent with experimental data; or to 
argue, on some grounds, that this type 
of visual experience, depth under homo- 
geneous conditions, does not fall within 
the present bounds of the theory. 

1 Now at the University of Wisconsin.

Gibson (1959; Gibson & Waddell,
1952) has taken the latter course: There
exist perceptual phenomena which take
place in response to “impoverished, am- 
biguous, or equivocal pattern stimula- 
tion” (Gibson, 1959, p. 466). These ex- 
periences are unreliable, varying across 
observers and from time to time in the 
observer. They are not considered im- 
mediately because they are a function of 
something other than stimulation; only 
perceptions for which a correspondence 
can be demonstrated with attributes of 
optical arrays serve initially as subject 
matter. Corollaries will be stated even-
tually encompassing the excluded events. 
Among these events is the perception of 
visual depth under homogeneous retinal 
stimulation. 

To omit from consideration certain 
classes of data is a useful and unobjec- 
tionable strategy for the early stages of 
theoretical development, especially when 
criteria are specified. It is another matter 
to ask whether the particular omission 
meets the explicit standard. The standard
in the present instance seems to be
variability, both within and between ob- 
servers. The following examination of 
the findings of recent experiments at- 
tempts to determine whether one such 
omission is justified. 

Gibson, Purdy, and Lawrence (1955) 
eliminated all inhomogeneities in stimu- 
lation beyond the circular first aperture 
of an optical tunnel. The results from
10 practiced observers were summarized 
in a few sentences: 


Words like filmy, translucent, soft, milky, 
hazy, or misty were generally applied to the 
luminous circular area. Sometimes this looked 
flat or two-dimensional, sometimes it looked 
deep, like “3-dimensional light,” and for two
Os it looked like a homogeneous convex 
sphere or disk which later became concave. 
Faint rings or circles were sometimes (but 
not always) apparent, and an impression of 
depth usually accompanied this. Something
tunnel-like was often seen, but the reports 
were variable as between Os and from one 
moment to the next (Gibson et al., 1955,
p. 9). 

That an impression of depth can occur 
under homogeneous conditions, but that 
it is very variable is indicated by this 
experiment. By virtue of the lack of a 
demonstrated correspondence with a par- 
ticular perception or response, the pre- 
sented stimulus pattern falls nicely into 
the category of conditions temporarily 
to be neglected (Gibson, 1959). On the 
other hand, the conditions of stimulation 
were only partially homogeneous; there 
was, of course, the gross inhomogeneity
between the optical tunnel and the visi- 
ble piece of Vinylite plastic which bears 
the aperture. 

Gibson and Waddell (1952) referred 
to Katz's (1935, p. 89) observation that 
distant objects whose microstructure is 
not visible are experienced, nevertheless, 
as having a solid surface character because
of the “framing contours . . .
and meaningful setting.” Cohen (1957)
demonstrated that the introduction of a 
small, homogeneous, circular spot (8 
centimeters in diameter at 1 meter 
distance) into a homogeneous field of 
different intensity or wave length yields
an experienced recession of the previ- 
ously present, pervading fog. A gross in- 
homogeneity between two areas, there- 
fore, affects the way the homogeneous 
one is perceived. The two- and three- 
dimensional impressions of the tunnel
area in the experiment of Gibson, Purdy, 
and Lawrence might have alternated de- 
pending on where the observer looked, a 
soft surface quality occurring when a 
point on the plastic sheet was fixated, 
and an experience of depth when the 
tunnel area was looked at directly. 

Using a hemispherical lighting fixture 
in which the observer placed his face 
and was stimulated by light shining 
through from the outside, Gibson and 
Dibble (1952) and Gibson and Waddell 
(1952) found that what is experienced 
is a "luminous fog" or "sea of light." 
However, they pointed out that the de- 
scriptions were not consistent; at times 
"something vaguely surface-like" was 
experienced. They ascribed the incon- 
sistency to facial temperature and audi- 
tory stimulation from breathing. Cohen 
(1957) suggested that facial shadows 
may have been responsible. 

In further work Gibson and Waddell 
used two translucent hemispheres made 
from a table tennis ball and fitted to each 
eye. Under high intensity of light stimu- 
lation the retinal images were "ap- 
proximately homogeneous." Hochberg, 
Triebel, and Seaman (1951), employing 
the same method, fastened eyelashes to 
eyelids to remove a conceivable remaining
source of inhomogeneity. The grain
of the translucent surfaces might have 
contributed further inhomogeneities, but 
neither article suggested this, although 
the mention of “approximate” homo- 
geneity may intend this factor. 

With regard to the question of whether 
perception of depth occurs reliably under
homogeneous retinal stimulation, the 
study of Gibson and Waddell (1952) 
leaves some doubt. Several practiced
observers “tended to agree that the ex- 
perience had some kind of voluminous or 
space-like quality and . . . the reports 
also agreed that the experience was in- 
definite, indeterminate, or ambiguous” 
(p. 267). This is a result to be expected 
when conditions approximate homogeneity;
what is experienced will depend
on where one is looking and whether the 
momentary and successive arrays are 
homogeneous. 

Because the practiced observers might 
have been prejudiced in their reports 
by knowledge of earlier findings, 13 rela- 
tively naive observers served in the ex- 
periment proper. Three representative 
reports were given by Gibson and Wad- 
dell. All of these, although not explicitly 
referring to depth, implied the perception 
of something with volume. "Fog coming
up to my eyes," "a white that you could
see into," "same penetrability" as fog,
and "might wander in it for hours" are
phrases taken from each report. Terms 
such as curtain, wall, and screen were 
used but “less frequently.” The impres- 
sions varied between observers and 
within some observers from moment to 
moment. 

The variability in this experiment 
might be a result of two factors. 

1. The presence of gross inhomoge- 
neities in stimulation can lead to ex- 
tended effects such as the perception of 
a homogeneous area as surface rather 
than deep fog. Argument for this asser- 
tion is given above, in the discussion of 
the findings of Gibson, Purdy, and Law- 
rence (1955). One can add that Gibson 
and Dibble (1952) also have suggested 
this effect in another context. They 
wrote, “However, it is possible that the 
steep intensity gradient at the contour 
may have been the effective stimulus for 
the impression in this case rather than 
the texture within the contour.” In ad- 
dition to the already mentioned possible 
inhomogeneities in this experiment, 6 of 
13 observers reported an impression of a 
closed boundary, suggesting an addi- 
tional source, perhaps a change in in- 
tensity of illumination at those parts of 
the eyecups which constituted the peri- 
phery of the field of vision. 

2. Another source of variable reports,
suggested by Cohen (1957), may have 
been a consequence of the preliminary 
training session during which the ob- 
servers practiced and were given instruc- 
tion in the description of perceived 
object qualities. As Cohen put it, the ob- 
server then may “seek something other 
than fog.” Thus the observer was placed 
in an approximately homogeneous situa- 
tion set to respond to any inhomoge- 
neity. That reports of “fog, mist, haze, 
and cloud” exceeded those of “curtain, 
wall, and screen” is evidence for the 
tendency to have a volume experience 
even under conditions departing from 
optimal homogeneity. 

Hochberg, Triebel, and Seaman 
(1951), using a similar method of stimu- 
lation, took an additional precaution in 
attempting to equate intensity over the 
entire field. They mounted on each side 
of the headrest, reflectors of white card- 
board. (This may not have been neces- 
sary or useful in the Gibson and Waddell 
experiment, because the method of il- 
lumination involved reflected light from 
an entire wall and floor; Hochberg, 
Triebel, and Seaman sent light through 
the eyecups by means of a slide pro- 
jector.) The results show that only 1 of 
12 observers reported seeing a surface. 
The rest experienced a fog or cloud. 

Most recently Cohen (1957) used an 
apparatus comprised of intersecting 
spheres, each 1 meter in diameter. The 
internal surfaces of these hollow spheres 
were diffuse reflectors; they produced in- 
direct uniform illumination over the en- 
tire internal surface when a small part 
was illuminated directly. Even at high 
intensities, no texture was visible. When
the two spheres were equally illuminated, 
the circular aperture (8 centimeters in 
diameter) between them, the circle of 
their intersection, was not visible as such. 
To eliminate inhomogeneities from face 
shadows, Cohen prepared for each observer
a mask which served as part of
the wall of one sphere. An appropriate 
hole in the mask permitted monocular 
viewing from a point directly opposite 
the aperture between spheres. The 
spheres were illuminated by means of 
two beams of light passed into the far- 
ther sphere through two openings out 
of the observer's area of vision. One 
beam passed through the aperture be- 
tween spheres and was directed to a 
point 20 centimeters above the observer's 
eye. 

Sixteen relatively naive observers re- 
sponded to a condition of homogeneous 
visual stimulation. The most characteristic
report was of a foglike experience.
Cohen contrasted the consistency
of these reports with the variability 
shown by Gibson and Waddell (1952). 
Here the fog was seen “to extend an 
indefinite distance." Distance judgments 
were indeterminate. The introduction 
of a difference in intensity or wave length 
between the nearer sphere and the far- 
ther sphere as seen through the aperture 
resulted in reduced fog density and in- 
creased fog distance from the observer. 
Even when the intensity gradient was 
greatest (aperture area 2.3 times as 
bright), fog was reported in more than 
half the cases. 

One can conclude from this discussion 
of recent experiments as follows. 

1. In experiments where the percep- 
tion of voluminous fog does not occur 
consistently, there are sources of inho- 
mogeneity which can produce the im- 
pression of a surface. 

2. As homogeneity is approached, the 
volume experience becomes more re- 
liable. 

3. If more intensive experimentation 
bears out the above conclusions, it will
be necessary to revise Gibson's theory 
(1959) so that the reliable phenomenon 
of depth under conditions of homogene- 
ous retinal stimulation can be handled. 

Another theorist, Koffka (1935), has
accepted depth experience under ho- 
mogeneous conditions as fact on the basis 
of Metzger's (1930) early experiments.
To produce a homogeneous field, Metz- 
ger used as reflector surface a white- 
washed wall with wings at the four sides. 
The observer sat before the wall at a 
distance and in an orientation such that 
his entire visual field was comprised of 
stimulation from the wall and wings. 
Only under conditions of relatively low 
illumination was Metzger successful in 
producing a homogeneous retinal field. 
At brighter levels the space filling fog 
receded and was transformed into a sur- 
face at a distance. This change with in- 
tensity was said to result from inho- 
mogeneities of stimulation produced by 
the microstructure of the wall. That an 
increase in the inhomogeneity of effective 
stimulation was responsible is made 
credible by Cohen’s results (1957). They 
showed a recession of fog as the degree of 
inhomogeneity between circular aper- 
ture and internal surface of the sphere 
increased. 

From Metzger’s results, Koffka 
(1935) drew a “fundamental principle
of psychophysical organization” which 
states: “Under the simplest possible 
conditions of stimulation our perception 
is three-dimensional; we see space filled 
with neutral colour stretching into a 
more or less indeterminate distance" (p. 
115). This effect is attributed by Koffka 
to the nature of the perceptual system. 

A view of the phenomenon, other than 
Koffka's and Gibson's interpretations, is 
suggested by the presence in experience 
of inhomogeneity under visual stimulus- 
conditions which are homogeneous. Two 
types of experienced inhomogeneity can 
be distinguished under the following 
headings. 

1. Relatively localized: Close scrutiny 
by the observer of the visual field under 
homogeneous conditions of stimulation, 
it is claimed (Koffka, 1935), results in
“points of light and cloudlike structures 
shifting through his field” (p. 120). 
Cohen (1957) observed, under presum- 
ably homogeneous conditions, possibly 
related phenomena: a “cracked ice ef- 
fect” and a “weblike structure.” These 
could be eliminated through minute ad- 
justments of the brightness of the circu- 
lar aperture. Cohen tentatively ascribed 
the effect to minimal intensity differences 
between sphere and aperture, insufficient 
to produce more localized articulations 
in experience. Hochberg, Triebel, and
Seaman (1951) reported that some ob- 
servers experienced “hallucinatory ob- 
jects and patterns.” Gibson and Waddell 
(1952) reported inhomogeneous experi- 
ences which they attributed to shadows 
on the retina from sources in or on the 
eye. These seemed to move with the eye. 

2. Pervasive: The pervasive effect of 
tridimensionality is an instance of in- 
homogeneity in experience present under 
conditions of homogeneous stimulation. 
An analysis of what is meant by perceiv- 
ing depth leads to statements such as 
(a) under conditions of fixated, mono- 
cular viewing, parts of the optical array 
are experienced as more distant than 
other parts; or (b) under conditions of
free viewing, looking in one direction 
results in an experience of an optical 
array which seems less distant than the 
one experienced when looking in another 
direction. Ordinarily such effects are at- 
tributed primarily to differential stimu- 
lation provided within or between optical 
arrays (Gibson, 1959). However, when
differential stimulation is absent, such 
spatial anisotropies must be ascribed to 
differential functioning of the perceptual 
system. 

In discussing the presence of direc- 
tional gradients in visual space, i.e., 
visual space's having an up and a down, 
a right and a left, Köhler (1940) traced 
the development of this structure to extravisual
or kinesthetic sources to the
“structure of motor space" (p. 17). A 
result of this structuring of visual space, 
according to Köhler, is that a figure can
be experienced as different depending on 
its locus and orientation. In considering 
the changes in perception which result 
from reversing the position of objects in 
pictures, Gaffron (1956) speculated in 
a similar way: 

Thus the phenomena on right-left reversal 
need not be caused by any laterality in pri- 
mary visual sensory processes but rather by 
their integration with visual motor and pro- 
prioceptive processes for which a primary 
laterality appears to exist (p. 286). 

Studies of the anisotropy of visual 
space with respect to depth have been 
conducted with well differentiated figures 
(Adair & Bartley, 1958; Gaffron, 1950). 
These studies have demonstrated that 
objects are experienced as closer to the 
observer when they are at the left in a 
picture than at the right. The reversal 
in position is accomplished by present- 
ing the mirror-image of the picture. 
Adair and Bartley (1958) have used an 
indirect method for showing the effect. 
The observer was required to adjust the 
distance of a drawing which was twice 
the size of the standard until it was ex- 
perienced as being an equal distance 
away. On some trials the variable was 
identical to the standard; on other trials 
it was the standard's mirror-image. Gaf- 
fron (1950, 1956) appears to have pre- 
sented the observer with paintings and 
their mirror-images and asked about the 
apparent nearness of objects within the 
paintings. A number of convincing
demonstrations, which the reader is asked
to examine for himself, are provided as well
(Gaffron, 1950). To what extent these
results will be generalized to less structured
optical arrays remains to be seen.
Fisher (1961) used the direction of auto- 
kinetic movement as a measure of right- 
left anisotropy on the grounds that “the 
autokinetic situation is so completely unstructured
and therefore maximizes the
possibilities for a person to impose his 
own biases upon the stimulus" (p. 64). 

The proposed view for the perception 
of depth under homogeneous retinal 
stimulation can be summarized as fol- 
lows. The perception of depth must in- 
volve differential reactions of perceived 
distance to aspects of a single optical 
array or to successive optical arrays. 
When stimulating conditions cannot be 
responsible for such differential reac- 
tions, it is necessary to attribute the 
effect to differential functioning of the 
perceptual system brought on by other 
factors. A likely source of space aniso- 
tropy with respect to visual distance is 
kinesthetic stimulation. The latter re- 
sults in the visual impression that stimu- 
lation coming from one part of the visual 
field has a nearer source than equivalent 
stimulation coming from another part. 

A review and critique of research on form thresholds and their determi- 
nants. Among the determinants discussed are stimulus frequency, re- 
sponse suppression, response stereotypy, set, value, emotional tone, and 
differential electric shock. The problems of defining and measuring form 
thresholds are elaborated, with particular reference to studies of sub- 
liminal perception and differential threshold. The importance of the 
growing interest in separating response parameters from perceptual ones
is pointed out.


DEFINITIONS 


Discussion of the form threshold liter- 
ature is complicated by problems of defi- 
nition. The term “recognition,” for ex- 
ample, has been variously defined (cf. 
Binder, 1955; George, 1952), as has the 
term “discrimination” (cf. Arnoult, 
1954; Casperson, 1950). This article will 
follow a terminology based upon the 
response required of the subject (Hake, 
1957). Recognition will refer to tasks 
in which the subject indicates whether 
or not he has seen the stimulus before, 
discrimination to tasks in which the sub- 
ject indicates whether or not the stimulus 
differs from one or more other stimuli, 
identification to tasks in which the sub- 
ject indicates the name of the stimulus, 
detection to tasks in which the subject 
indicates whether or not a stimulus is 
present, and judgment to tasks in which 
the subject indicates the position of the 
stimulus on a scale. 


THE OLD AND THE NEW LOOK


Prior to the late 1940s, when the 
New Look perceptual experiments began 
to appear in psychological literature, 
form perception was generally left to 
Old Look Gestalt theorists, who emphasized
the role of the stimulus (Gottschaldt,
1926; Wertheimer, 1923). Little
attention was paid to the observer as a 
parameter, and the influence of experi- 
ence or learning was explicitly denied 
(Koffka, 1935). 

New Look experimenters, on the other 
hand, focused their attention upon the 
observer and his inner states as de- 
terminants of perception. Where the 
Old Look laws of form perception could 
be illustrated with pictures of stimuli, 
New Look principles required anecdotes 
(Osgood, 1953, cf. pp. 216, 286). 

Bruner and Postman (1947) found 
that words to which subjects showed long 
latencies in a word association test had 
low identification thresholds (perceptual 
vigilance). Postman, Bruner, and Mc- 
Ginnies (1948) and Vanderplas and 
Blake (1949), using the Allport-Vernon 
scale of values, found that words related 
to the subjects’ highest value areas had 
lower identification thresholds than 
neutral words (sensitization), while 
words related to subjects’ lowest value 
areas had higher thresholds than neutral 
words (perceptual defense). McGinnies 
(1949) provided another demonstration 
of perceptual defense in his well-known 
“dirty word” experiment, where thresh- 
olds for taboo words (PENIS, WHORE) 
were shown to be higher than thresholds 
for neutral words (MUSIC, STOVE). 

In the early 1950s the New Look 
studies were virtually inundated by
what might be called the First Wave of 
criticism. The essential feature of this 
First Wave was an emphasis upon more 
general threshold determinants, such as 
word frequency. 


WORD FREQUENCY


It was argued (Howes & Solomon, 
1950) that taboo words are low in 
Thorndike-Lorge (1944) frequency and 
that this alone could account for their 
high thresholds in the McGinnies experi- 
ment. McGinnies (1950) retorted that 
undergraduate subjects’ usage of taboo 
words is not at all what one would infer 
from The Teacher's Word Book (Thorn- 
dike & Lorge, 1944). Howes and Solo- 
mon (1951) buttressed their critique of 
McGinnies’ study by showing visual 
duration thresholds for a number of 
words to be an approximately linear, de- 
creasing function of Thorndike-Lorge 
frequency. 

Solomon and Howes (1951) and
Postman and Schneider (1951) did si- 
multaneous experiments to determine the 
importance of this word frequency varia- 
ble for the results found by Postman, 
Bruner, and McGinnies (1948) with the 
Allport-Vernon scale of values. Both 
sets of experimenters dichotomized the 
words for each value area into a frequent 
and an infrequent group and presented 
three-way analyses of variance on the 
threshold data (subjects by value rank 
by word frequency). Both found sig- 
nificant main effects for subjects and 
for frequency. Postman and Schneider 
also found a significant frequency by 
value interaction, which they interpreted 
to mean that value rank is a threshold 
determinant for low but not for high 
frequency words. Solomon and Howes, 
however, found only a subject by fre- 
quency interaction. They felt that the 
data of Postman, Bruner, and McGinnies 
(1948), and of Postman and Schneider 
(1951) arose from the subjects’ increased
exposure to or use of high value
words.

In 1952 Solomon joined with Postman 
(Solomon & Postman, 1952) to show 
that visual duration thresholds for seven- 
letter nonsense words were a negatively 
accelerated decay function of the num- 
ber of times the subjects had previously 
pronounced the words. The experi- 
menters, in response to McGinnies’ 
(1950) objection that taboo words are 
more frequently used than their Thorn- 
dike-Lorge frequencies suggest, pointed 
out that the frequency of a word's ap- 
pearance in print is more germane to its 
visual threshold than is its frequency 
in conversation. A recent experiment by 
Zigler and Yospe (1960) supported 
McGinnies by showing that the subjects' 
judgments of the frequency and familiar- 
ity of words were related to Thorndike- 
Lorge frequency for pleasant but not for 
unpleasant words. However, Solomon
and Postman's point was substantiated 
by Sprague (1959), who found that 
visual thresholds were related to the 
number of times the subjects saw non- 
sense syllables, but not to the number 
of times they heard them. 

It was apparent from the Solomon and 
Postman study (1952) that Postman was 
having second thoughts about his earlier 
work, and in that same year McGinnies 
(with Comer & Lacey) joined the First 
Wave of critics by replicating Howes 
and Solomon's finding (1951) of a nega- 
tive relationship between visual duration 
threshold and Thorndike-Lorge word 
frequency. King-Ellison and Jenkins 
(1954) and Baker and Feldman (1956) 
provided replications of Solomon and 
Postman's finding (1952) that thresh- 
olds were negatively correlated with the 
frequency of nonsense syllables. The
frequency-threshold relationship was
also noted by DeLucia and Stagner 
(1954), Eriksen and Browne (1956), 
Fulkerson (1957), and Taylor (1958). 
By 1958 it had attained sufficient status 
to be set forth in a general scientific
journal with suggestions for practical ap- 
plications (Rosenzweig & Postman, 
1958). 

In addition, the influence upon identi- 
fication thresholds of such variables as 
word length (McGinnies, Comer, & 
Lacey, 1952) and recency (Eriksen & 
Browne, 1956; Postman & Solomon, 
1950) was demonstrated, as was the in- 
fluence of stimulus frequency upon 
familiarity judgments (Arnoult, 1956; 
Noble, 1954). 

The frequency-threshold relationship 
was, of course, not without its critics. 
Howes and Solomon’s data (1951) 
showed marked curvilinearity and it was 
pointed out (Adams & Brown, 1953; 
Lazarus, 1954) that Thorndike-Lorge 
frequency appeared to affect duration 
thresholds only for very low frequency 
words. In many subsequent studies using 
Thorndike-Lorge frequencies (e.g., Mc- 
Ginnies et al., 1952) only linear regres- 
sion lines were presented, and it was 
therefore not always possible to tell if the
underlying relationship was in fact 
linear. Inspection of Solomon and Post- 
man’s data (1952) for nonsense words 
with frequencies from 0 to 25 suggests 
that inclusion of the zero category is re- 
sponsible for the obtained relationship 
between frequency and threshold. When 
the zero category is disregarded, it is 
seen that the significant Fs must have 
resulted from mean differences in a di- 
rection contrary to the hypothesis. 
Replications of the Solomon and Post- 
man study (Baker & Feldman, 1956; 
King-Ellison & Jenkins, 1954) avoided 
these objections, but it became clear that 
stimulus frequency influences identifica- 
tion thresholds only within a limited 
frequency range. 

Meanwhile an even more serious ob- 
jection was developing to the notion that 
perception is affected by stimulus frequency.
Early studies of the influence
of frequency upon thresholds tended to
treat frequency as if it were an inherent 
property of stimuli, comparable to length 
(Henle, 1942). Somewhat later, experi- 
menters began to speculate on how 
frequency might operate to sensitize the 
perceptual system (Howes & Solomon, 
1951). It became increasingly apparent,
however, that stimulus frequency affects 
responses rather than perception per se. 
As early as 1952, for example, Solomon 
and Postman suggested that when only 
tachistoscopic fragments of stimuli are 
seen, subjects will tend to “guess” high 
frequency words. In later writings 
though (Postman & Rosenzweig, 1956), 
Postman stated that the more frequent 
the stimulus, the smaller the fragment 
necessary for identification, which im- 
plies that frequency does not reduce 
thresholds merely by raising response 
probability.

Eriksen and Browne (1956) presented 
a direct statement of the idea that fre- 
quency (as well as recency, set, punish- 
ment, etc.) appears to affect perception 
simply because it increases response 
strength. Isolated bits of data relevant 
to this hypothesis were to be found in 
the literature; it was noted, for example, 
that word associations and incorrect 
word identifications tended to be high in 
frequency (Johnson, 1956; McGinnies 
et al., 1952; Rosenzweig & Postman, 
1957). And, of course, there have al- 
ways been those in the Old Look tradi- 
tion who hold that any influence of ex- 
periential variables such as frequency 
upon perception is more apparent than 
real (e.g., Pratt, 1950; Zuckerman & 
Rock, 1957). 

The stage was clearly set for the 
Goldiamond and Hawkins (1958) study. 
After giving the subjects differential 
training with nonsense syllables, these 
investigators asked for subliminal identifications
at very rapid tachistoscopic exposure
speeds. Using a pseudoascending
method of limits, they were able to show 
the typical frequency-threshold relationship
despite the fact that no syllables
were ever presented in the tachistoscope. 
The Goldiamond and Hawkins study, 
then, provided fairly strong evidence for 
the hypothesis that stimulus frequency 
affects responses rather than perception. 
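
The logic of this finding can be illustrated with a small simulation. The
sketch below is not a reconstruction of the Goldiamond and Hawkins procedure;
it is a toy model resting on several assumptions that do not come from the
study: the syllable names and training counts are invented, guesses on blank
flashes are assumed to be emitted in proportion to training frequency, and a
"threshold" is scored as the first trial in a series on which the guess
happens to match an arbitrary scoring key. Under these assumptions, lower
"thresholds" appear for frequently trained syllables although nothing is
ever shown.

```python
import random


def simulate_blank_flash_thresholds(training_counts, n_series=2000,
                                    max_trials=25, seed=0):
    """Mean 'threshold' per syllable when every flash is blank.

    Assumption (hypothetical): on each trial the subject guesses a syllable
    with probability proportional to how often it was seen in training.
    """
    rng = random.Random(seed)
    syllables = list(training_counts)
    weights = [training_counts[s] for s in syllables]
    mean_threshold = {}
    for target in syllables:  # the syllable the scoring key calls "correct"
        total = 0
        for _ in range(n_series):
            # If the key is never matched, score one step past the series.
            threshold = max_trials + 1
            for trial in range(1, max_trials + 1):
                guess = rng.choices(syllables, weights=weights, k=1)[0]
                if guess == target:
                    threshold = trial
                    break
            total += threshold
        mean_threshold[target] = total / n_series
    return mean_threshold


# Invented nonsense syllables and training frequencies, for illustration only.
training_counts = {"ZOJ": 1, "VEK": 2, "BUP": 5, "DAX": 10, "KIF": 25}
thresholds = simulate_blank_flash_thresholds(training_counts)
for syllable, count in training_counts.items():
    print(f"{syllable}: trained {count:2d} times, "
          f"mean 'threshold' = {thresholds[syllable]:.1f} trials")
```

Because the stimulus plays no part in the model, any ordering of the scored
thresholds by training frequency can only reflect response probability,
which is the interpretation at issue here.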

It has been suggested (Casperson, 
1950) that such response bias might ac- 
count for the few studies (e.g., Hoch- 
berg, Gleitman, & MacBride, 1948) 
which have found lower thresholds for 
"good" forms, such as the circle. In
support of this idea, a number of experi- 
menters (e.g., Bridgen, 1933) have re- 
ported that at very rapid tachistoscopic 
speeds, good gestalten responses are dis- 
proportionately frequent. Soltz and 
Wertheimer (1959) present a similar 
explanation for their finding that good 
forms are more easily recognized than 
bad ones. On the other hand, Hochberg 
and McAlister (1953) propose using re- 
sponse bias as a measure of stimulus 
goodness.


RESPONSE SUPPRESSION AND SET 


The First Wave of critics had offered 
another major objection to New Look 
experiments in general and to dirty word 
studies in particular. This second objec- 
tion (Howes & Solomon, 1950) was that 
subjects might consciously delay report- 
ing taboo words and thus obtain spuri- 
ously high thresholds for them. 

McGinnies (1949) had reported that 
his subjects denied response suppression, 
but numerous studies were undertaken 
to test the response suppression hypothe- 
sis more rigorously. Whittaker, Gilchrist, 
and Fischer (1952) found that Negro 
subjects admitted response suppression 
for racially scurrilous words to Negro 
experimenters but not to white experimenters.
Cowen and Beier (1954) and
Freeman (1955) found that perceptual 
defense could be demonstrated when the 
subject and the experimenter were of the 
same sex, and presumably less inhibited 
in reporting, as well as when the subject 
and the experimenter were of opposite
sexes. Zigler and Yospe (1960) found 
no differences in the perceptual defense 
effect between subjects who verbally 
identified the stimuli and subjects who 
simply checked them as pleasant or un- 
pleasant. 

By far the most common approach 
to the response suppression hypothesis, 
however, was to determine whether or 
not threshold differences between taboo 
and neutral words disappeared with 
forewarning or set. Cowen and Beier 
(1950) found that subjects who were 
forewarned showed no difference in 
threshold between threat and nonthreat 
words, although later experiments by 
these investigators failed to confirm the 
finding (Beier & Cowen, 1953; Cowen & 
Beier, 1954; Cowen & Obrist, 1958).
Lacy, Lewinger, and Adamson (1953)
and Postman, Bronson, and Gropper
(1953) found that subjects who expected 
taboo words showed perceptual vigilance, 
or reduced thresholds, rather than perceptual
defense. Perceptual vigilance under
these conditions was also shown by
Freeman (1954, 1955), who added that 
subjects who expected taboo words but 
were actually shown nontaboo ones 
freely made obscene guesses. In addi- 
tion, Freeman replicated earlier work by 
Bitterman and Kniffin (1953) who had 
found that thresholds for successive 
taboo words dropped even though sub- 
jects had not been forewarned. The same 
drop in identification threshold as a
function of the order of the taboo word 
can be seen in McGinnies’ original data 
(1949).

Wiener (1955) provided another 
demonstration of perceptual vigilance in 
a study which related thresholds for 
such words as ramy, presented in a
sexual context, to thresholds for the 
same words presented in a neutral con- 
text. An experiment by Smith (1954),
which showed that subjects who learned 
a list of hostile words subsequently had 
lower thresholds for hostile stimuli than
for neutral ones, has also been con- 
sidered germane to the response suppres- 
sion hypothesis. 

On the basis of such experiments as 
the foregoing, many concluded that the 
perceptual defense effect could be ex- 
plained in terms of "differential readi- 
ness to report" (Bitterman & Kniffin, 
1953). It was felt that these experiments 
showed that “when Ss are told what 
to expect, they have less reason to suppress
responses” (Lysak, 1954). This
conclusion was drawn even by some 
whose previous work suggested an alternative
explanation. Thus, when Postman
(Postman, Bronson, & Gropper,
1953) accepted the response suppression 
explanation for perceptual defense, he 
largely neglected the fact that he and 
Bruner (Bruner & Postman, 1949) had 
already demonstrated a very similar pro- 
gressive reduction in thresholds with in- 
creasing knowledge, using "trick" play- 
ing cards such as a red three of spades. 

Although some of these experimenters 
(e.g., Fulkerson, 1957) related the re- 
sponse suppression hypothesis to the vast 
literature on set in general (viz., Gibson, 
1941), most did not. And so here again, 
as with the frequency-threshold hy- 
pothesis, the critics of the New Look 
came in for criticism. 

Many studies have appeared which 
demonstrate the effects of expectancy, 
context, labels, reduction in stimulus or 
response populations, etc., upon repro- 
duction of stimuli (Braly, 1933; Car- 
michael, Hogan, & Walter, 1932), identi- 
fication of stimuli (Cofer & Shepp, 1957; 
Krulee, Podell, & Ronco, 1954; Miller, 
Heise, & Lichten, 1951; Postman & 
Brown, 1952; Siipola, 1935; Taylor, 
1956), and judgment of stimuli (Hill, 
1953). Often so-called practice effects 
are understandable in terms of set 
(Bruner, Miller, & Zimmerman, 1955), 
along with such phenomena as increased 
accuracy in perceiving incongruities with 
successive trials (Bruner & Postman,
1949; Smock, 1955b). 

At about the same time that response 
suppression was being explored, the old 
question of Response versus Perception 
was overtaking the set experimenters. 
The first widely cited study was that of 
Lawrence and Coles (1954), who 
showed that providing subjects with re- 
sponse alternatives after the stimulus 
presentation facilitated identification to 
the same degree that providing response 
alternatives both before and after the 
stimulus did. In a second experiment, 
Lawrence and LaBerge (1956) found no 
difference in accuracy between subjects 
who were instructed to note all three 
dimensions on which a set of stimuli 
varied (form, color, and number) and to 
report these dimensions in a specified 
order, as compared to subjects who were 
instructed to attend to only one dimen- 
sion but to report all dimensions in any 
order they chose. Lawrence and his co- 
workers concluded that set influenced re- 
sponse variables rather than truly facili- 
tating the perceptual process. 

In a subsequent study, Fulkerson 
(1957) found that frequency and ex- 
pectancy interacted in such a way that 
expectancy decreased thresholds for in- 
frequent words but not for frequent 
ones. Recent experiments by Long et al.
(Long, Henneman, & Garvey, 1960; 
Long, Reid, & Henneman, 1960; Reid, 
Henneman, & Long, 1960) provided 
mixed support for the response variable 
hypothesis, since they showed that re-
sponse alternatives presented after stim-
uli were as effective as alternatives
presented both before and after stimuli,
except when the number of alternatives
was small. Brown (1960) also cast
some doubt upon the response variable 
explanation of set by showing that giv- 
ing instructions in advance, as opposed 
to giving them during the stimulus 
presentations, had no effect upon sub-
jects’ reports of position or color alone,
but increased the accuracy of nam-
ing alone and of reporting all three
dimensions. The before-after meth- 
odology used in most of these studies 
of set has been criticized on the grounds 
that providing response alternatives 
after stimulation may interfere with 
optimal recall. Sperling (1959), for 
example, has shown that memory for 
tachistoscopically presented material di- 
minishes rapidly with time. Brown 
(1960) attempted to eliminate this ob- 
jection by using a before-during method, 
but it is possible that providing response 
alternatives during stimulation has a 
detrimental effect on encoding. 

As will be discussed presently, the ap- 
propriateness of the chance correction 
employed by Lawrence, Long, and others 
concerned with set has been seriously 
questioned, and their conflicting results 
may in part be a reflection of it. In view 
of these problems, it would perhaps be 
fruitful to approach the problem of set 
as a response variable in the same way 
that Goldiamond and Hawkins (1958) 
investigated frequency as a response 
variable. 

Another technique which holds prom- 
ise for systematic study of the effects 
of set (as well as of recency and fre- 
quency) upon perception was em- 
ployed in 1960 by Epstein and Rock. 
They presented subjects with ambiguous 
figures of the Wife: Mother-in-Law type, 
preceded by varying experience with one
or both of the components. For example, 
they told one group of subjects to ex- 
pect to see two figures and then pre- 
sented at a rapid rate the sequence 
WWW, followed by the composite W/M. 
In this way, frequency and recency were 
pitted against expectancy. Their overall 
conclusion was that recency exercises a 
strong influence upon perception of the 
composite, frequency somewhat less so, 
and expectancy little if at all. They 
also suggested examination of the role 
of specific memory traces. Their study
suffers from a lack of consideration of
effects such as negative frequency 
(Bruner & Wechsler, 1958) and like- 
response sequences (Senders & Sowards, 
1952; Verplanck, Collier, & Cotton, 
1952) and also from the apparently arbi- 
trary exclusion of subjects who could 
not readily verbalize their expectancy, 
but it does offer a nontachistoscopic ap- 
proach which is free of the chance cor- 
rection problem. 


RESPONSE VERSUS PERCEPTION 


It is quite apparent as we follow the 
New Look experimenters and their First 
Wave of critics that concern with isolat- 
ing response variables from perceptual 
ones was spreading. Even while Ammons 
(1954) was suggesting that attempts to 
separate perception and responses would 
lead only to “fruitless dispute,” tech- 
niques were being proposed to study 
just this issue. 

Neisser (1954), in a study of set, 
showed the subjects a list of 10 words 
and told them that some of the words 
would later be presented tachistoscopi- 
cally. The words actually presented in 
the tachistoscope fell into three groups: 
five “set” words which had appeared on 
the list, five homonyms of words which 
had appeared on the list, and five control 
words which had not appeared on the 
list. Neisser found that his subjects 
identified the set words significantly 
more readily than they did either the 
homonyms or the control words. He 
concluded that since the verbal response 
for naming a homonym is identical with 
that for naming its corresponding set 
word, set facilitates the perceptual proc- 
ess rather than verbal responses. 

Neisser’s experiment has been criti- 
cized (Goldiamond, 1958) on the 
grounds that the word pairs used (e.g.,
phrase-frays) differed in frequency.
Since responses were verbal and the de-
sign was counterbalanced, this criticism
appears to be invalid. However, Neisser's
experiment does suffer from a major
flaw. It is obvious that a fragmentary 
percept of SCE will be more likely to 
elicit the response “scene” if the subject 
has been shown a list on which SCENE 
appears than if the list contains no simi- 
lar word. It is equally obvious that if 
the list contains instead the word SEEN, 
the subject who perceives SCE might be- 
come confused or wary (cf. Bruner & 
Postman, 1949) and defer responding. 
Ross, Yarczower, and Williams (1956) 
have demonstrated that subjects who 
expect to see one member of a pair of 
homonyms show higher thresholds for 
the other member of the pair than they 
do for completely unrelated words which 
they also do not expect to see. This 
finding, they add, does not result from 
subjects' responding to the homonym
with its corresponding set word. This
means that while Neisser has shown
that set does not reduce thresholds 
simply by increasing response probabil- 
ity, he has not shown that set operates 
to facilitate the perceptual system. 

Shortly after Neisser's study appeared, 
Garner, Hake, and Eriksen (1956) pre- 
sented a cogent discussion of the Re- 
sponse versus Perception issue, along 
with examples of convergent operations 
by means of which the two might be 
separated. Eriksen (1958a) subse-
quently made use of a technique derived
from information theory (Shannon &
Weaver, 1949) to show that providing
subjects with feedback operated pri-
marily upon their choice of responses
and only in a minor way upon their ability
to perceive the stimuli. 

Collier and his associates (Brackmann
& Collier, 1958; Collier & Verplanck, 
1958; Verplanck et al., 1952) studied 
serial effects in the brightness detection 
situation and found that subjects tended 
to give series of either Yes or No re-
sponses (cf. Senders & Sowards, 1952),
that subjects tended to respond Yes more
often to trials which followed an inter-
polated bright stimulus (cf. Schafer,
1950), and that subjects tended to re- 
solve uncertainties by saying Yes during 
ascending presentations and No during 
descending presentations. Goldiamond
(1955) and Conklin and Sampson 
(1956) confirmed the nonperceptual na- 
ture of such effects by showing their 
dependence upon methods of measure- 
ment. Bruner and Wechsler (1958) 
demonstrated similar effects when sub- 
jects were asked whether series of stimu- 
lus figures were open or closed. 


ELECTRIC SHOCK EXPERIMENTS 


Several experiments have used electric 
shock rather than existent anxieties or 
values in studying perceptual defense 
and vigilance. Lysak (1954) reported 
that under punishment conditions higher 
thresholds were found for shocked syl- 
lables, but under subsequent no-punish- 
ment conditions sensitization (vigilance) 
for previously shocked syllables oc- 
curred. He interpreted this finding as 
support for the response suppression ex- 
planation of perceptual defense. In a 
study by Reece (1954), three groups 
were formed: one that could avoid shock 
by making appropriate responses, one 
that could not avoid shock, and one that 
received no shock. The no-escape group 
showed significantly higher thresholds 
than either the escape or the no-shock 
groups, and the difference between the 
last two groups was not significant. 
Rosen (1954) found that shocking in- 
correct identifications resulted in “per- 
ceptual sensitization,” while random 
shocks produced “perceptual disrup-
tion.”

In these shock studies the possible 
role of such variables as differential mo- 
tivation, feedback, and response stereo- 
typy was generally overlooked. An ex- 
periment by Zeitlin (1954) shed some 
light by showing that associating words 
with punishment (a sudden raucous 
noise) or with reward (money) lowered
their tachistoscopic thresholds, even
when no words were shown tachistoscopi- 
cally (cf. Goldiamond & Hawkins, 
1958). Smock (1955a) contributed a 
study which showed that subjects ad- 
hered longer to previous perceptual hy- 
potheses under stress conditions. 
Eriksen and Wechsler (1955) sharp- 
ened earlier work by Rosenbaum (1952) 
and by Eriksen (1954) with a demon- 
stration that shock influences judgments 
of square size by increasing the fre- 
quency of preferred responses. As a re- 
sult of such investigations, studies utiliz- 
ing shock to produce perceptual defense 
and vigilance (e.g., Dulany, 1957)
tended to become more circumspect and 
to explain their findings in response 
rather than perceptual terms. 


RESPONSE STEREOTYPY 


One of the few determinants of identi- 
fication thresholds that was originally 
proposed as a response effect rather than 
as a perceptual one was demonstrated by 
Wyatt and Campbell (1951). They 
showed that the ascending method of 
limits resulted in higher form identifica- 
tion thresholds than did presentation of 
the stimuli at isolated random speeds, 
presumably because subjects under the 
former method clung to erroneous hy-
potheses. Blake and Vanderplas (1950)
found that subjects who offered incorrect 
responses prior to identification obtained 
significantly higher auditory thresholds 
than subjects who did not offer such re- 
sponses. Little interest has been shown 
in these findings, despite the fact that 
they appear to conflict in some respects 
with popular theories on the effects of 
preidentification hypotheses (Bruner, 
1957; Postman, 1953). 


VALUES AND EMOTIONAL TONE 


There has also been relatively little 
interest shown in the effect of values and 
emotional tone upon word thresholds, 
possibly because the early studies (Aron-
freed, Messick, & Diggory, 1953; Gil-
christ, Ludeman, & Lysak, 1954; Post-
man, Bruner, & McGinnies, 1948;
Postman & Schneider, 1951) were 
swamped by the First Wave of criticism 
(Goodstein, 1954; Solomon & Howes, 
1951). Newton’s (1955) subjects made 
more errors for unpleasant than for 
pleasant words, even with frequency 
controlled. Osgood (1957) suggested 
that perceptual defense and vigilance 
effects might result from fairly gross 
detection of the emotional “meaning” of
stimuli prior to their identification, but 
this possibility has been weakened both 
logically and empirically by Eriksen 
(Eriksen, 1958b, 1960; Eriksen, Azuma, 
& Hicks, 1959). Eriksen, Azuma, and 
Hicks also found that long words tended 
to be judged as unpleasant. 

Johnson, Thomson, and Frincke
(1960) have recently presented results 
which suggest that word frequency and 
value judgments are related. They used 
the semantic differential and showed: 
(a) significant rank order correlations 
between the Thorndike-Lorge frequency 
of randomly selected words and their 
judged pleasantness, (b) that subjects 
tended to circle as more pleasant the 
frequent member of pairs of words 
chosen to reflect value areas (Solomon & 
Howes, 1951), (c) that nonsense sylla- 
bles high in association value were rated 
as more pleasant, (d) that the frequency 
of nonsense words such as those used by 
Solomon and Postman (1952) was 
positively correlated with their rated 
pleasantness, (e) that pleasant words 
had lower tachistoscopic identification 
thresholds than did unpleasant ones of 
the same frequency, and (f) that fre- 
quent words had lower thresholds than 
infrequent words of the same degree of 
judged pleasantness. Johnson and his
co-workers concluded from these find-
ings that past reinforcement facilitates
perception. They went on to say, with
particular reference to (e) above, that
no response bias was apparent. How-
ever, the results obtained lend them-
selves so well to a response probability
interpretation that more investigation 
seems called for. If, as Epstein and Rock 
(1960) suggest, frequency is a more 
potent determinant than expectancy, the 
Johnson study may explain why some 
studies of forewarning find that even 
subjects who expect to see unpleasant 
words show lower thresholds for pleasant 
ones.


SUBLIMINAL PROCESSES AND
PARAMETERS OF THRESHOLD 


In addition to New Look studies which 
rely upon demonstrations of threshold 
differences, there are those which postu- 
late subliminal processes. The first well- 
known experiment in this genre was that 
of Lazarus and McCleary (1951), where 
it was shown that galvanic skin responses 
were higher for shock than for nonshock 
words on trials where verbal identifica- 
tion of the words was incorrect. Mc- 
Ginnies (1949) was apparently the first 
to demonstrate such galvanic skin re-
sponse differences, but Lazarus and Mc-
Cleary, who coined the term “subcep-
tion,” are generally credited with the
discovery. Like its threshold-difference 
predecessors, the Lazarus and McCleary 
study provoked a good deal of criticism. 

Bricker and Chapanis (1953) pointed 
out that attributing Lazarus and Mc- 
Cleary’s results to “subception” rests 
upon the assumption that verbal identifi- 
cation is an all-or-none affair and no 
information is conveyed by stimulus 
presentations which are incorrectly iden- 
tified. They tested this assumption by 
presenting words at short tachistoscopic 
durations and requiring subjects to 
guess from a word list until the experi- 
menter said “correct.” For one set of 
words, the experimenter said “correct” 
when the subject gave a word its proper 
name. For another set of words, which 
did not appear on the subject’s list, the
experimenter said “correct” when the sub-
ject gave a name which had been arbi-
trarily chosen from the word list as the
correct response. Bricker and Chapanis 
found that when responses were initially 
incorrect, significantly fewer trials were 
needed to identify the words than to 
guess the arbitrary correct responses. 
They argued that the subception phe- 
nomenon stems from subjects’ partial 
perceptions of the stimuli. 

Murdock (1954) reported a similar 
study in which he showed that the num- 
ber of incorrect guesses prior to identi- 
fication of the words used by Lazarus 
and McCleary (1951) varied with the 
level of illumination. His subjects said, 
in response to questioning, that when 
they could not see the word as a whole, 
they tried to use individual letters in 
their guesses. 

Howes (1954) presented a probability 
model for the subception effect, and 
Eriksen (1956a, 1956b) offered an ex- 
planation in terms of partially correlated 
response systems, along with evidence 
that thresholds derived from galvanic 
skin response data are, if anything, less 
sensitive than those obtained from verbal 
reports (Dulany & Eriksen, 1959). 

Voor (1956) showed that subjects in 
the subception situation tended to use 
nonshock syllables when in doubt. He 
concluded that the subception effect de- 
pended upon information in presenta- 
tions where subjects were “guessing.” 

Boardman (1957) added still more 
such evidence by showing that subjects 
were able to identify letters well below 
the threshold for complete word identi- 
fication, and that they made use of these 
letters in preidentification responses. 

Renewed exploration of the possibility 
of below-threshold effects has coincided 
with contemporary interest in subliminal 
advertising. Reviews of the extensive 
experimental literature have recently 
appeared (Adams, 1957; McConnell, 
Cutler, & McNall, 1958), as have meth-
odological critiques (Eriksen, 1958b,
1960; Goldiamond, 1958). 

Shevrin and Luborsky’s (1958) re- 
vival of the Poetzl phenomenon is cur- 
rently a prime example for those who 
believe in subliminal effects. They 
showed that the dreams of subjects who 
had briefly seen and described a compli- 
cated picture tended to contain the un- 
reported elements of the picture. John- 
son and Eriksen (1961) replicated the 
Poetzl effect in their subjects’ images, 
but since even subjects who had not 
seen the picture showed the effect, they 
concluded that the Poetzl phenomenon is 
nonperceptual. 

Perhaps the most widely cited recent 
work on subliminal effects is that of 
Klein and his co-workers. The “limen” 
referred to in these studies, however, is 
typically that for spontaneous report. 
Thus a neutral face was shown to psy- 
chiatric patients, along with the super- 
imposed words HAPPY or ANGRY (Smith, 
Spence, & Klein, 1959), and descriptions 
of the face varied accordingly. Smith 
et al. state that the “verbal stimuli were 
considered subliminal if they fell below 
the threshold of spontaneous report,” 
but they tend nevertheless to claim that 
the words were exposed below “the 
threshold for recognition” and that 
“none of the Ss consciously detected the 
word.”

Other studies of “subliminal” influ- 
ence included admissions that the stimuli 
may not have been subliminal (Klein, 
Spence, Holt, & Gourevitch, 1958), and 
some even chose stimuli which were visi- 
ble to the experimenter but did not 
“distract” the subject (Goldstein & 
Barthol, 1960). Still other studies have 
used definitions of threshold such that 
accuracy is in excess of chance (Blum, 
1954; Nelson, 1955; Pustell, 1957; Ver- 
non & Badger, 1959). In short, it is clear 
that many so-called investigations of 
subliminal perception are in fact studies 
of incidental stimulation or of supra- 
liminal perception. 


A number of purported subliminal 
perception studies (e.g., Taylor, 1953) 
demonstrated merely that discrimination 
may be possible when identification is 
not. This possibility is plainly implied, 
not only by experiments of the Bricker 
and Chapanis type (1953), but also by 
early studies of the microgenetic charac- 
ter of perception. 

Flavell and Draguns (1957) summar- 
ized the work on microgenesis, or the 
sequential temporal development of per- 
ception. Using, by and large, tachisto- 
scopic techniques and free report or 
drawings, both German and American 
investigators (e.g., Bridgen, 1933; Free- 
man, 1929; Helson & Fehrer, 1932) 
agreed that there are definite stages to 
form perception. Generally, studies de- 
scribed three or four stages: a vague, 
undifferentiated whole; some measure 
of figure-ground differentiation, with 
amorphous details; a tentatively formed 
object which could not be identified; 
and correct identification. The work of 
Krauskopf, Duryea, and Bitterman 
(1954) showed that considerably more 
illumination is necessary to identify a 
form than to detect its presence. 

Just as the distinction between dis- 
crimination and identification is often 
crucial to subliminal experiments, so is 
the distinction between phenomenal re- 
port (the subject indicates Yes or No for 
each stimulus interval) and forced re- 
port (the subject indicates where in 
time or space a stimulus occurred). 
Adams (1957) has pointed out that one 
of the most reliable ways to demon- 
strate subliminal effects is to compel 
subjects to attempt discriminations or
identifications that they feel are just
guesses.

Blackwell (1952, 1953) did a series 
of experiments comparing methods of 
data collection in a brightness detection 
task. He found that the phenomenal 
report method produced higher thresh- 
olds than the forced report method, 
after corrections for chance. In a manner
of speaking, Blackwell's subjects showed
subliminal perception, since they were 
able to locate the light when they could 
not "see" it. 

Both Goldiamond (1958) and Erik- 
sen (1958b) have called attention to the 
relevance of Blackwell's data for much 
of the work on subliminal perception, 
and Eriksen particularly has argued that 
the difference between subjects' forced 
report and Yes-No thresholds reflects 
their “subjective confidence” level and
is subject to many extraneous influences. 
Smith and Wilson (1953) have shown 
marked shifts in Yes-No thresholds for 
auditory detection, depending upon 
whether subjects were instructed to be 
“liberal” or “conservative.” 

The above investigations of forced 
report and Yes-No methods have all 
used a detection task. As applied to form 
identification, however, the forced re- 
port method simply requires the subject 
to make a response, as opposed to the 
free report method, where the subject re- 
sponds only when he sees the form. A 
search of the form identification litera- 
ture revealed only one study in which 
the two methods were directly compared. 
Lysak (1954) found that forced report 
identification thresholds for words were 
lower than free report thresholds, but 
his subjects were subjected to electric 
shock, and it is probable that they 
avoided responding more often than 
might otherwise have been the case. It 
has often been pointed out (Gibson, 
1951; Hake, 1957) that the problems of 
measuring form thresholds are especially 
complex, and further investigation of 
the free and forced report methods of 
obtaining form identifications appears 
to be needed.

Many of the form studies which have 
been cited (Bricker, 1955; Casperson, 
1950; Eriksen et al., 1959; Lawrence & 
Coles, 1954) relied upon a chance cor- 
rection known as Abbott’s formula (Fin- 
ney, 1947). The most typical form is: 
$P_c = (P_o - P_g)/(1 - P_g)$, where $P_c$ = the
corrected proportion, $P_o$ = the observed
proportion of correct or positive re-
sponses, and $P_g$ = an estimate of the pro-
portion of correct or positive responses
obtained by chance. The principal as-
sumption is that guesses are independ-
ent of responses due to sensory dis-
crimination.
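
As a rough illustration of the chance correction just described, the following Python sketch applies Abbott's formula to an observed proportion. The function name `abbott_correction` and the sample values are assumptions made here for illustration only; they are not taken from any of the studies cited.

```python
def abbott_correction(p_observed: float, p_chance: float) -> float:
    """Correct an observed proportion for guessing using Abbott's formula.

    Computes (p_observed - p_chance) / (1 - p_chance).
    """
    if not 0.0 <= p_observed <= 1.0:
        raise ValueError("p_observed must lie in [0, 1]")
    if not 0.0 <= p_chance < 1.0:
        raise ValueError("p_chance must lie in [0, 1)")
    return (p_observed - p_chance) / (1.0 - p_chance)


if __name__ == "__main__":
    # Hypothetical case: four equally likely response alternatives,
    # so the chance proportion is taken to be 0.25.
    print(round(abbott_correction(0.70, 0.25), 3))
```

The correction yields zero when performance equals the chance estimate and becomes negative when performance falls below it, which is one reason the choice of chance estimate matters.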

However, a number of investigators, 
working with visual and auditory detec- 
tion models derived from decision theory, 
have shown this assumption to be false. 
Smith and Wilson (1953) showed that 
subjects instructed to be "liberal" in 
the detection of auditory signals did 
significantly better, after correction for 
chance, than “conservative” subjects. 
Tanner and Swets (1954), in a Yes-No 
light detection situation, demonstrated 
that the false alarm rate is correlated 
with corrected thresholds. Birdsall 
(1955) suggested that the conventional 
threshold, far from being independent 
of false alarm rates, is "strictly a mono- 
tonic function" of them. Tanner (1955) 
described visual and auditory detection 
experiments in which subjects using 
forced report methods made more cor- 
rect second choices than the guessing 
hypothesis would predict.
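
The decision-theory argument can be illustrated with a small numerical sketch. It assumes an equal-variance Gaussian observer whose sensitivity (d') is held fixed while only the response criterion moves, and it shows that the chance-corrected hit rate still changes with the criterion. The model, the function names, and the parameter values are assumptions introduced here for illustration; they are not a reconstruction of the analyses of Smith and Wilson, Tanner and Swets, or Birdsall.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def yes_no_rates(d_prime: float, criterion: float) -> tuple[float, float]:
    """Return (hit rate, false alarm rate) for an equal-variance Gaussian observer.

    Noise is centered at 0, signal-plus-noise at d_prime; the observer
    says Yes whenever the internal value exceeds the criterion.
    """
    hit = 1.0 - _STD_NORMAL.cdf(criterion - d_prime)
    false_alarm = 1.0 - _STD_NORMAL.cdf(criterion)
    return hit, false_alarm


def corrected_hit_rate(d_prime: float, criterion: float) -> float:
    """Apply the chance correction, using the false alarm rate as the guessing estimate."""
    hit, false_alarm = yes_no_rates(d_prime, criterion)
    return (hit - false_alarm) / (1.0 - false_alarm)


if __name__ == "__main__":
    d_prime = 1.0  # hypothetical fixed sensitivity
    for label, criterion in (("liberal", 0.0), ("conservative", 1.5)):
        hit, fa = yes_no_rates(d_prime, criterion)
        corrected = corrected_hit_rate(d_prime, criterion)
        print(f"{label}: hit={hit:.3f} false_alarm={fa:.3f} corrected={corrected:.3f}")
```

Under these assumptions the "liberal" observer obtains the higher corrected score although sensitivity is identical in the two cases, so the corrected score is not independent of the false alarm rate.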

The implication of such results for 
the notion of a detection threshold is 
obvious. In the present discussion, how- 
ever, the question is how these results 
apply to form thresholds. 

Hyman and Hake (1954) showed 
that subjects required to identify a 
briefly presented circle, cross, diamond, 
or square made systematic errors; for ex- 
ample, when the square was incorrectly 
identified, it was more often called a 
circle than it was called a cross or dia-
mond. The work of Bricker and Cha-
panis (1953) and of Murdock (1954)
indicated that incorrect identifications 
of words are related to the particular 
stimulus that is shown. There is little 
doubt, then, that incorrect identifications 
are not random; at speeds below the 
identification threshold, a stimulus such
as INSET looks more like INSERT or IN-
SECT than it looks like PSYCHOLOGY. But
it is a far less serious matter when sub- 
jects respond systematically below an 
identification threshold than it is when 
they respond systematically below a de- 
tection threshold. The former, as we 
have discussed, can be explained by par- 
tial cues; the latter cannot be explained 
at all without doing violence to the con- 
notations of the term “detection.” 

On the other hand, such findings, as 
Hake (1957) has pointed out, greatly 
restrict the generality of form thresh- 
olds, which can be expected to vary with 
the size of the stimulus population and 
with the nature of the forms used. It 
is impossible to discover “the” threshold 
for a circle, or even to state with as- 
surance that the circle has a “low” 
threshold, without specifying the re- 
maining forms. And adding to the am- 
biguities of threshold determination for 
form is the dependence of threshold 
measures upon the criteria adopted. Hel- 
son and Fehrer (1932), for example, 
showed that threshold for correct per- 
ception of form was much higher than 
threshold for first form perceived with 
the ascending method of limits. 

The last of the major New Look tech- 
niques to be considered was introduced 
by Blum (1954). His subjects were 
shown panels of four unidentified pic- 
tures (from the Blacky Test) at a 
tachistoscopic exposure “below conscious 
awareness." After obtaining a baseline 
series of judgments as to which picture 
“stood out most," Blum structured one 
picture as traumatic (e.g., “This is
Blacky licking his sexual parts") and 
another as neutral (e.g., “This is Blacky 
chasing flies"). Subsequent judgments 
as to which picture stood out most 
shifted in the direction of the traumatic 
picture (vigilance). The subjects were 
then shown the same four pictures at a 
slower, but still subawareness, speed and
asked to locate alternately the trau-
matic and the neutral picture. Here
perceptual defense occurred; subjects 
located the traumatic picture less accu- 
rately than the neutral one. 

Both Blum and those who have used 
his method define conscious awareness 
in a liberal way. Nelson (1955) re- 
ported that pictures shown “below the 
threshold of clear conscious recognition” 
were identified significantly better than 
chance. Pustell (1957) defined thresh- 
old as 62% accuracy in a situation 
where only 25% of the pictures should 
be identified by chance. In Blum’s 
original experiment (1954), almost all 
subjects at the lowest level of awareness 
felt that they were looking at pictures 
of elephants, which suggests a good deal 
of leeway for discrimination among the 
pictures. At the slower, but still sub- 
awareness, exposure speed, Blum re- 
ported that pilot subjects “very fre- 
quently recognized one or more of the 
(four) pictures.” Tests run by the
author on Blum’s location data show 
that at this speed even the least accu- 
rately located pictures (those against 
which subjects were presumably de- 
fending) were located significantly 
more often than chance expectancy. 
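
To make the point about liberal threshold definitions concrete, the sketch below computes the exact binomial probability of reaching a given accuracy by guessing alone. The 62% and 25% figures are those reported above for Pustell (1957); the number of trials (50) is a hypothetical value chosen only for illustration, since it is not given here, and the function name is likewise an assumption.

```python
from math import comb


def binomial_upper_tail(successes: int, trials: int, p_chance: float) -> float:
    """Probability of at least `successes` correct responses in `trials`
    independent trials when each is correct with probability `p_chance`."""
    return sum(
        comb(trials, k) * p_chance**k * (1.0 - p_chance) ** (trials - k)
        for k in range(successes, trials + 1)
    )


if __name__ == "__main__":
    trials = 50        # hypothetical number of trials
    successes = 31     # 62% of 50
    p_chance = 0.25    # four pictures, one correct
    tail = binomial_upper_tail(successes, trials, p_chance)
    print(f"P(at least {successes} of {trials} correct by guessing) = {tail:.2e}")
```

With any reasonably large number of trials, 62% accuracy against a 25% chance level is very unlikely to arise from guessing, which is why such a "threshold" is hard to describe as lying below awareness.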

Attempts to replicate Blum’s percep- 
tual defense effect (Raskin, 1954; 
Smock, 1956) have generally been un- 
availing, but the vigilance effect has 
been obtained by other investigators. 
Smock (1956) was probably the first 
to suggest that Blum’s vigilance phe- 
nomenon was a function of differential 
emphasis on the two pictures. This sug-
gestion has not been experimentally
pursued, but it is plausible and parsi- 
monious. The author (Pierce, 1961) 
has shown that at very rapid tachisto- 
scopic durations, subjects have time to 
look at only one of two simultaneously 
presented stimulus words. This finding 
probably explains how Blum’s subjects 
made clearness judgments between 
stimuli that did not vary in “clearness”;
they judged as clearest the stimulus they
had time to see. Further study of the 
determinants of which stimulus is 
looked at in such situations is obviously 
needed. 

In some respects, studies of form 
threshold have come full circle. The 
Old Look theorists were primarily Ge- 
stalt psychologists who emphasized the 
stimulus determinants of perception. 
The New Look experimenters, with 
their concern for inner parameters, were 
dismissed by the Old Look on the 
grounds that the New Look was not 
studying perception at all, but rather 
interpretation or meaning. In contrast, 
the First Wave of critics, as we have 
called them, did not often doubt that 
the New Look dealt with perception; 
their argument was that more general 
determinants were involved. 

Much of the recent thinking about 
the New Look, and about form percep- 
tion generally, agrees with the Old Look 
position that variables such as need, 
set, and frequency do not affect percep- 
tion as such. It has become increasingly 
clear that many threshold parameters 
act upon the response system instead 
of upon the perceptual system. The 
observer, faced with a complex world— 
or with an ambiguous tachistoscopic 
stimulus—and relying upon a limited 
visual system (cf. Gibson & Gibson, 
1955; Hake, 1957), must supplement 
his perceptions according to what has 
happened in the past and whether he 
can expect it to happen again. For the 
future, there is a great deal to be 
learned about the effects of need, set, 
frequency, and so forth upon responses, 
and about the conditions under which 
response variables play a greater or 
lesser role in the perceptual situation. 
As sophistication continues to increase, 
particularly about such concepts as 
threshold, we can look forward to real 
advances in our knowledge of form 
perception and its determinants.


JUDGMENT OF PERSONAL CHARACTERISTICS AND
EMOTIONS FROM NONVERBAL PROPERTIES
OF SPEECH¹

A person's changing emotional state and relatively stable personal
characteristics may be judged from nonverbal properties of his voice. 
These properties include such elements as timbre, inflection, and stress, 
which accompany the actual words spoken but are not a direct part of 
them. Many studies have used inadequate measures as the independent 
criterion for the traits being judged, and no method of eliminating the 
verbal content has been wholly successful. The evidence does show, 
however, that some validity of judgment is possible. Acoustic analysis 
has been little used; it could increase the objectivity of studies. Individ- 
ual differences among listeners and the relationship of voice to psycho- 
pathology have been particularly neglected areas in research. 


In a fantasy novel by the poet Robert 
Graves, a man of the distant future asks 
a twentieth century Englishman, “Do I 
speak with correctitude?” “With great 
correctitude,” he is assured, “but without
the modulations of tone we English use
to express, or disguise, our feelings”
(Graves, 1949, p. 1). All of us not only
use such modulations ourselves, but
also make judgments about others' cur-
rent feelings and attitudes, as well as
about more stable personal character- 
istics, partly on the basis of how they 
"sound" to us. Sullivan has stated 
(1954, p. 7) that these "sound ac- 
companiments suggest what is to be 
made of the verbal propositions stated." 
Whether or not we can interpret them
correctly, whether or not speaker and
listener would agree to their significance,
these “non-verbal but nonetheless pri-
marily vocal aspects of the exchange”
(Sullivan, 1954, p. 5) play an important
part in the perception of persons.

¹ This paper was prepared in part during
tenure of a Predoctoral Fellowship from the
National Institute of Mental Health, United
States Public Health Service, and in part under
Contract No. SAE-9265 with the Language
Development Section, United States Office of
Education.

The author wishes to express his appreciation
to Harlan Lane for reading the manuscript and
offering many pertinent criticisms and sug-
gestions.

The first major experiment on impres- 
sions of persons based on voice alone 
(Pear, 1931) analyzed over 4,000 re- 
ports from British radio listeners who 
had responded to questions about nine 
different readers they had heard on the 
air. Age and sex proved easiest to esti- 
mate correctly. An actor and a clergy- 
man were most consistently identified 
correctly from among the nine profes-
sions represented. The highest leader-
ship ratings were given to the speakers
whose voices were professionally im- 
portant to them: an actor, a judge, 
and a clergyman. Birthplace of the 
speakers was not guessed with signifi- 
cant accuracy. Certain errors in guess- 
ing a speaker's profession showed sig- 
nificant consistency, suggesting that 
some voices provide a stereotype of a 
certain occupation even though this is 
not the actual occupation of this 
speaker. Such “vocal stereotypes” have 
remained the most frequent finding in 
all studies of the relationship between 
voice and personality (Kramer, 1963).

The studies which followed are divided 
here into two major categories: those 
which called for judgments from voice 
of relatively stable characteristics of an 
individual, and those which asked for 
judgments of emotional or affective 
variables which change over relatively 
short periods of time. Both kinds of 
judgments involve problems of separat- 
ing nonverbal aspects of the voice from 
the actual words spoken, and problems 
of adequate independent criteria for the 
traits being judged. Physical characteris- 
tics of an individual may usually be eas- 
ily measured; but aptitudes, interests, 
and personality traits present special 
problems. Such pencil and paper inven- 
tories as those by Bernreuter (1931) and 
Bell (1934), frequently used in such 
studies, have many limitations of their 
own (McKelvey, 1953; Tyler, 1953). If 
valid judgments are made from voice, a 
low correlation between such judgments 
and inventory scores might still be a fre- 
quent finding (Kramer, 1963). The 
problem of separating verbal and non- 
verbal aspects of speech is dealt with 
in the section below on voice and chang- 
ing emotional states. 


VOICE AND STABLE CHARACTERISTICS
OF AN INDIVIDUAL 


Physical Characteristics


A speaker's age can apparently be 
judged with better than chance accuracy 
from his voice (Allport & Cantril, 1934; 
Herzog, 1933; Pear, 1931), but two of 
the studies (Allport & Cantril, 1934; 
Pear, 1931) found a tendency for esti- 
mates of age to center in the thirties. 
On judgments of height, both positive 
(Herzog, 1933) and negative results 
(Allport & Cantril, 1934; Cantril & 
Allport, 1935) have been reported. Voice 
and the overall appearance of a speaker, 
both in person and in photographs, have 
been matched with statistically signifi-
cant accuracy (Allport & Cantril, 1934).
An attempt to match voices with photo- 
graphs of the Kretschmerian body types 
(Kretschmer, 1925) found the pyknic 
type matched most accurately, then lep- 
tosomatic, and finally athletic (Bona- 
ventura, 1935). Another study (Fay & 
Middleton, 1940c) asked judges to 
match speakers’ voices with paragraphs 
describing the three body types. The 
athletic type was matched no better than 
chance, while low positive correlations 
were found for the others. No actual 
morphological measurements of the 
speakers were made, but each was as-
signed a Kretschmerian type merely on
the basis of superficial appearance. Since 
such measurements were considered a 
necessary step in classification by 
Kretschmer (1925), the speakers may 
not have represented good examples of 
the classification given them. 

Certain types of brain damage alter 
aspects of an individual’s intonation 
pattern, which Monrad-Krohn (1947, 
1957) has referred to as “prosody.” He 
reported that these intonation changes 
give the listener the impression of a per- 
son speaking with a foreign accent. The 
importance of the pitch contour of 
speech—changing patterns in the funda- 
mental frequency—in conveying impres- 
sions of accent and language was demon- 
strated by Cohen and Starkweather 
(1961). They found that English-speaking
listeners could judge whether or not
they were hearing a recording made from 
English speech, even after the recording 
had been passed through a low-pass fil- 
ter which removed all those higher fre- 
quencies required for recognition of 
words (French & Steinberg, 1947). 


Aptitudes and Interests


The one reported study on voice and 
intelligence (Fay & Middleton, 1940b) 
found a correlation of .33 between esti- 
mates of intelligence from voice and 
speakers’ IQs on the Terman Group Test
of Mental Ability (Terman, 1920). A 
voice in the “average intelligence” group 
which was consistently judged to be 
that of a person of very low intelligence 
led the authors to conclude, “Possibly 
all the ratings indicate voice stereotypes. 
The fact that some of them agree with 
test results of intelligence may be purely 
coincidental” (p. 190). In another 
study, Fay and Middleton (1943) had 
15 freshman fraternity men rated for 
leadership by 10 fraternity seniors who 
had known them for 6 weeks. These 
ratings showed no correlation with rat- 
ings of leadership based solely on the 
freshmen’s voices, although the inter- 
judge reliability of the voice judgments 
was .41. The authors feel that this de- 
gree of social agreement, in the face of 
no accuracy compared to the criterion, 
suggests the presence of vocal stereo- 
types of leadership. The presence of 
stereotyped voices is given by Allport 
and Cantril (1934) to explain the un- 
expected success their listeners had in 
judging speakers’ political preferences. 
Judgments of speakers’ dominant 
values were compared with the speakers’ 
scores on the Allport-Vernon Study of 
Values (Allport & Vernon, 1931) in two 
studies (Allport & Cantril, 1934; Fay 
& Middleton, 1939b). The earlier study
found mixed results, while Fay and Middleton
found that the types “judged most ac- 
curately in terms of mean percentage 
superior to chance are: political, 46%;
esthetic, 29%; social, 23%” (p. 154). 
In another of their series of studies, Fay 
and Middleton (1939b) asked listeners 
to identify the vocations of speakers rep- 
resenting several vocations. Only the 
voice of a preacher was correctly iden- 
tified consistently better than chance, 
and it was frequently mistaken for that 
of a lawyer. Earlier studies on voice and 
vocation reported more positive results 
(Allport & Cantril, 1934; Pear, 1931). 


Personality 


This section is divided into judgments 
from voice of personality traits, of per- 
sonality in more global terms, and of 
personality adjustment and psycho- 
pathology. All three areas have suffered 
from a lack of adequate independent 
criteria for the success of vocal judg- 
ments. Attempts to specify what ele- 
ments of voice are responsible for the 
judgments have often only served to il- 
lustrate the lack, noted by Sapir (1927), 
of an adequate vocabulary for describ- 
ing voice. 

Personality traits. Judgments of 
dominance were found to be significantly 
correlated with scores on the Allport 
A-S Reaction Study (Allport & Allport, 
1928) by Allport and Cantril (1934), 
while Eisenberg and Zalowitz (1938) 
found no significant correlation between 
such judgments and their criterion meas- 
ure, the Maslow Social Personality In- 
ventory (Maslow, 1937). Fay and Mid- 
dleton found that listeners had no 
success in estimating either introversion- 
extraversion (Fay & Middleton, 1942) 
or sociability (Fay & Middleton, 1941) 
from speakers' voices, with the speakers'
scores on the Bernreuter Inventory 
(Bernreuter, 1931) as the criteria. In 
both cases, however, they found signifi- 
cant interjudge agreement, which they 
interpret as providing evidence for the 
presence of vocal stereotypes. It should 
be remembered that more recent con- 
sideration of the Bernreuter test—which 
is among the better validated of the cri- 
teria used in voice judgment studies— 
has shown that artifactual correlations 
within it seem to permit clear measure- 
ment of considerably fewer factors than 
the number of labeled scales, and ad- 
justed and maladjusted subjects show 
much overlap on their test scores (Tyler, 
1953). 

Attempts to match both test scores 
and judgments on certain traits with 
the particular voice qualities involved
have been made by Moore (1939) and 
Mallory and Miller (1958). Moore 
(1939) found that subjects with a 
“breathy” tone of voice tended to be 
lower in dominance and higher in intro- 
version. Mallory and Miller (1958) 
report that Bernreuter scores on intro- 
version were negatively related to loud- 
ness, low pitch, and resonance in the 
voice. A “slight positive association” is 
reported between dominance scores and 
the voice qualities of loudness, resonance, 
and lower pitch. 

Neither study provides the exact 
acoustical data which would be required 
to objectify the voice quality terms (cf. 
Ostwald, 1960b) and permit cross-vali- 
dation of the relationships. The same 
problem exists with Stagner’s (1936) 
finding that flow, poise, and clearness 
in speech correlate positively with ag- 
gression and negatively with nervous- 
ness. At least part of the correlations 
may be spurious due to the fact that 
the same group of listeners made the 
judgments on both voice quality and 
the speakers’ personality characteristics. 

Personality as a whole. One of the 
earliest studies on judgments from 
voice (Allport & Cantril, 1934) found 
that, on the average, summary per- 
sonality sketches of the speakers could 
be more accurately matched with their 
voices than could any single quality or 
trait. Taylor (1934) reported in the 
same year, however, no relationship be- 
tween the speakers’ scores on a ques- 
tionnaire based chiefly on items from 
Thurstone’s Personality Schedule (Thur- 
stone & Thurstone, 1929) and the scores 
predicted for them by listener-judges. 
He did find significant interjudge agree- 
ment. Wolff (1943) reported both high 
agreement among listeners on matching 
voices with summary descriptions of 
personality, and significant agreement 
between these matchings and ratings by
the speakers’ friends. In an interesting
variation, Wolff included the raters’ own 
voices among those to be judged. Only 
10.5% recognized their own voices. The 
“unconscious self-judgments” of the 
others agreed in general with the rat- 
ings given by other listeners, but they 
tended to judge each characteristic as 
more extreme or more obviously present 
than did others’ ratings. 

An attempt to define one aspect of 
how voices differ was made by Stark- 
weather (1955b, 1956b), who compared 
speakers with high and low scores on a 
personality test (Harris, 1953) which 
distinguishes hypertensives from nor- 
mals. His prediction that the hyperten- 
sive syndrome group would show greater 
incongruence between the verbal and 
nonverbal (roughly equivalent to tone of 
voice) aspects of their speech was not 
confirmed. His technique of removing 
those frequencies of the speech spectrum 
upon which word recognition depends 
(Fletcher, 1953; French & Steinberg, 
1947; Licklider & Miller, 1951) is dis- 
cussed below in the section on voice and 
changing emotional states, under “low- 
pass filtering.” The role of expert judg-
ment in determining personality from
voice was investigated by Jones (1942). 
He gave the Rorschach protocol from 
an adolescent boy to a well-known Ror- 
schach analyst, Piotrowski, and gave a 
recording of the boy’s voice to Moses, 
a laryngologist who has had a strong 
interest in the interrelationships of voice 
and personality (Moses, 1941, 1942, 
1954). The two independent analyses 
were considered to match well with each 
other. Moses (1942) has described from
a clinical point of view the 21 voice vari-
ables he considered in making his 
analysis. 

Personality adjustment and psycho- 
pathology. Although practicing clini- 
cians have been well aware of the im- 
portance of the nonverbal aspects of 
voice for problems of diagnosis and
therapy (Shakow, 1959; Soskin, 1953; 
Sullivan, 1954), few experimental stud- 
ies have been done. In the milder range 
of adjustment problems, Duncan (1945) 
had speakers rated for voice quality by 
fellow speech students after 3 weeks in 
class. Of the 30 descriptive terms used 
in the ratings, 11 could be used to iden- 
tify whether the speaker had been low 
or high in his social adjustment score 
on the Bell Inventory (Bell, 1934). No 
cross-validation of the discriminating 
terms was reported. Monotonism seems 
likely to affect speech patterns; Ramm 
(1946) found that 25 fifth graders with 
monotonism showed inadequate social 
and emotional adjustment on a number 
of personality tests, especially the Ror- 
schach. The lack of a control group in 
her study makes the criterion value of 
the Rorschach scores particularly ten- 
tative, since there is no adequate norma- 
tive data on children's Rorschach per- 
formances as an indicator of general 
personality adjustment. 

A study on schizophrenic children 
(Goldfarb, Braunstein, & Lorge, 1956) 
reported that these youngsters, compared 
with a normal group, were ineffective in 
conveying mood or emotion vocally, 
giving the effect of either no emotion at 
all or one which was inappropriate to 
the verbal content. Cohen (1961) found 
that naive judges could not separate 
schizophrenics from normals on the basis 
of voice quality and speech pattern. 
Experienced judges were used by Mosko- 
witz (1952), but her report on the diag- 
nostic significance in schizophrenia of 
"monotonous, weak, gloomy, and un- 
sustained" voices seems chiefly a re- 
minder of the difficulty in finding an 
adequate language for describing voice. 

A possible way out of that difficulty is 
suggested in the work of Ostwald 
(1960a, 1960b). He has done speech 
spectrum analyses of the voices of many
patients and compared these spectro-
gram records with common terms for 
describing voices. Such comparisons 
should clarify which terms have some 
reliable meaning. He has also made 
some tentative suggestions (Ostwald, 
1960a) about the relationships between 
certain types of psychiatric disorders 
and the spectrum analyses of the pa- 
tients’ speech. 

The role of the expert judge is best 
illustrated in the work of Moses (1941, 
1942, 1954) and Jones (1942). Moses'
Voice of Neurosis (Moses, 1954) pre-
sents most fully the foundations and im- 
plications of his belief that “voice is the 
primary expression of the individual, and 
even through voice alone the neurotic 
pattern may be discovered" (p. 1). He 
stresses the need for using different 
frames of reference in different social 
and linguistic groups. Although he has 
attempted to set down the relevant voice 
variables and his method of judging 
them in the most objective manner pos- 
sible, a large part of his work remains 
exclusively the analysis of the single 
expert. Some of his voice variables do 
seem quantifiable in terms which could 
be investigated in the laboratory; for 
example, “range” as the range of the 
fundamental frequencies used, and 
"rhythm" as a stress pattern of the 
changes in amplitude over time. Other 
categories, however, would need much 
redefining before they could be sub- 
mitted to close experimental, laboratory 
investigation. Despite this, Moses' clin- 
ical acumen and experience are impor- 
tant in an area marked by the inade- 
quacy of experimental studies. 


VOICE AND CHANGING EMOTIONAL
STATES 


Soskin (1953) has described vocal 
communication in terms of two chan- 
nels, each specializing in a different type 
of information. “Semantic information” 
is carried in the channel consisting of
the articulated patterns of sounds which 
we recognize as words and sentences. 
“Affective information" is carried in the 
channel bearing the changing, nonverbal 
features of the voice. 

The studies presented here on how 
successfully and under what conditions 
emotion is conveyed by the nonverbal 
parts of the voice are grouped below 
according to the method used for remov- 
ing the influence of the particular words 
spoken—the “semantic” channel—on 
the judgments of emotion. A grouping 
analogous to the previous section, where 
studies would be classified according to 
the emotions investigated, would make 
the unwarranted assumption that a label 
such as "grief" or "joy" meant the same
thing to all experimenters who used it. 
In fact, no test of the purity of the 
emotional categories seems to have been 
done by any experimenter. In the range 
of studies, each covering many and vari- 
ously labeled emotions, one man's joy 
may come close to being another man's 
poison. 


Meaningless Content 


The chief forms of “meaningless con- 
tent" have been numerals and letters 
of the alphabet; however, the earliest re- 
ported study of this type (Skinner, 
1935) requested subjects to say merely 
“Ah.” They were requested first to 
read a passage of emotional literature 
and listen to selected music that was de- 
signed to put them into a sad or happy 
emotional state. They then spoke, and 
the ah's of happiness showed higher 
pitch and greater force than those of 
sadness. Dusenbury and Knower (1939)
asked a group of speech students and 
instructors to “try to feel the designated 
emotional state and to use a tonal code 
that would indicate their feelings” 
(p. 67) while reciting the letters A 
through K. Eleven emotions were 
designated, and 8 sets of recitations
were selected out of an initial 22 on the 
basis of pretests. All of these eight sets 
were matched with the emotion which 
the speaker had tried to represent with 
significantly greater than chance accu- 
racy. In a later study (Knower, 1941) 
the speakers whispered the letters in 
an attempt to eliminate the effects of 
“tone,” since the fundamental frequency 
of the voice is not present in a whisper. 
Recognition of the intended emotions 
was still better than chance. These 
studies unfortunately cannot confirm 
whether these speakers, or any others, 
would use these same “tonal codes” 
when experiencing the emotions in real 
life situations. 

Similar material has been used for 
teaching psychotherapy (Thompson & 
Bradway, 1950). Two psychologists 
acted out a “therapeutic interview” in 
which they actually spoke only num- 
bers, although with the inflections which 
a genuine exchange between patient and 
therapist might have. The two partici- 
pants each listened separately to re- 
cordings of the interview and made 
statements about the “affective inter- 
change.” The two sets of statements 
were found to be significantly correlated 
with each other. Numerals also formed 
the only verbal content in a study by 
Pfaff (1954) in which an “experienced 
speaker” used numbers to express a 
variety of emotions. Out of various 
groups of listeners, junior high school 
students of low socioeconomic status 
did least well at identifying the emo-
tions being portrayed, while college
oral interpretation students did best. A
partial explanation of the results may 
be that the “experienced speaker” drew 
from the same stock of stereotypes and 
stage techniques with which the oral 
interpretation students were most fa- 
miliar. The low ranking of the junior 
high school students of low socioeco-
nomic status suggests the hypothesis
that the “tonal affect language" may be 
different, at least to some extent, for dif- 
ferent classes in a society. Inexperienced 
readers from a Navy ROTC group were 
used by Black and Dreher (1955). 
They read a list of five-syllable phrases
in a manner aimed at simulating cer- 
tainty or uncertainty. Listeners, pre- 
sumably of the same background as the 
speakers, were able to distinguish be- 
tween the two types of reading with 
high reliability. 

Recitations of the alphabet with vary- 
ing expressions were used by Davitz and 
Davitz (1959a, 1959b) in work which 
focused on errors and relative ease in 
guessing various emotions. In their first 
study (1959a) all feelings were identi- 
fied more consistently than chance alone 
would predict, but certain errors in iden- 
tification showed significant consistency. 
When fear was mistakenly identified, 
it was most commonly taken for nerv-
ousness; love was most commonly mis-
identified as sadness, and pride as satis-
faction. In their second study (Davitz &
Davitz, 1959b), they selected from pre- 
tests two speakers who were particularly 
successful at communicating feelings 
through reciting the alphabet. These 
speakers each used the alphabet to ex- 
press 50 different feelings. Thirty 
judges tried to match each recitation 
with the correct feeling. Ten judges 
rated each feeling from the list of 50, 
checking each feeling on the list which 
was similar to it. A similarity score was 
based on the number of times a feeling 
was noted as being similar to any other 
feeling. Separate groups of judges rated 
the feelings on the list in order to de- 
rive “strength,” “activity,” and “va- 
lence” scores for them. Findings were: 
(a) accuracy of identification of feelings
was correlated −.29 with similarity
scores (significant at the .025 level); 
(b) the degree to which one feeling is
mistaken for another is related to the
subjective similarity of the two; (c) for 
pairs of similar feelings, the stronger 
tends to be communicated more ac- 
curately; and (d) no significant rela- 
tionships appeared involving the activity 
or the valence scores. The authors note, 
however, that “since the relationships 
found were not high, the greater part 
of the variance in accuracy of commu- 
nication is unaccounted for" (p. 116). 
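
The similarity score and its correlation with accuracy of identification can be illustrated with a short sketch. The counting rule used below (each checked pair adds one to both feelings in the pair), the function names, and all of the data values are assumptions made for illustration; none of them is taken from Davitz and Davitz (1959b).

```python
import math

def similarity_scores(checklists, feelings):
    """Count how often each feeling takes part in a 'similar' judgment.

    checklists: one dict per judge, mapping a target feeling to the set of
    other feelings that the judge checked as similar to it.
    Assumed rule: every checked pair adds one to both feelings in the pair.
    """
    counts = {f: 0 for f in feelings}
    for judge in checklists:
        for target, similar in judge.items():
            for other in similar:
                if other != target:
                    counts[target] += 1
                    counts[other] += 1
    return counts

def pearson_r(xs, ys):
    """Product-moment correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Toy data, invented for this sketch only.
FEELINGS = ["fear", "nervousness", "love", "sadness"]
CHECKLISTS = [
    {"fear": {"nervousness"}, "love": {"sadness"}},  # hypothetical judge 1
    {"fear": {"nervousness"}},                        # hypothetical judge 2
]
ACCURACY = {"fear": 0.40, "nervousness": 0.45, "love": 0.60, "sadness": 0.65}

SCORES = similarity_scores(CHECKLISTS, FEELINGS)
R = pearson_r([SCORES[f] for f in FEELINGS], [ACCURACY[f] for f in FEELINGS])
```

In this toy case the feelings checked as similar more often are also the ones identified less accurately, so the correlation is negative, which is the direction of finding (a).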

The longest units of "meaningless 
content" were the eight neutral sen- 
tences used by Pollack, Rubenstein, and 
Horowitz (1960a, 1960b); e.g., “The
lamp stood on the table." The sentences 
were read in various modes under in- 
creasing signal:noise ratios. The au- 
thors found that recognition of the emo- 
tional modes intended held up better 
under noise than did recognition of the 
particular sentences. Mode recognition 
was also possible with significant ac- 
curacy even when the fundamental fre- 
quency was eliminated by having the 
sentences whispered. Effects of temporal 
sampling were also explored; some 
recognition of the modes was still pos- 
sible with extremely short samples and 
with sections of the recorded samples 
removed at periodic intervals. 


Constant Content 


Another technique for eliminating the 
effects of the words spoken has been to 
use the same set of words for all the 
emotions represented—words which are 
ambiguous as to emotional content, 
rather than meaningless in the sense 
that numerals or separate letters are. 
Like the use of meaningless content, this 
method requires the use of actors—ex- 
perienced or inexperienced—and thus 
may capitalize on the presence of stereo- 
typed representations of emotions which 
might not occur naturally. Fairbanks 
(1940; Fairbanks & Hoaglin, 1941;
Fairbanks & Pronovost, 1939) had ex-
perienced student actors read five pas-
sages, each marked by a different emo- 
tion. Listeners heard only a set of 
sentences that was common to all five 
passages; they were asked to identify 
the emotion being represented on the 
basis of these excerpts. Some of the ac- 
tors seemed to provide much clearer vo- 
cal differentiation of emotion than did 
others. Measurable pitch differences 
(Fairbanks & Pronovost, 1939) and 
differences in duration of phrases (Fair- 
banks & Hoaglin, 1941) were found 
among the different emotions, using 
average measures from the various read- 
ings. Fairbanks appears to have assumed 
that his common passage was ambiguous 
or neutral for the five emotions he 
studied. The present writer, in an un- 
published study, presented the passage 
in typescript to a group of raters and 
found it significantly most frequently 
identified as anger. Expressing such a 
passage in various emotional styles may 
have required an exaggeration of pitch 
and duration characteristics which would 
not be found in natural emotional 
speech. 

A standard passage was read by 
“tired” and “rested” speakers in an at- 
tempt to see if listeners could distinguish 
between the two groups (Fay & Middle- 
ton, 1940a). The rested speakers had 
their normal amounts of sleep, while 
the tired group had gone without sleep 
for 30 hours. They were all free from 
speech defects, “nor, in the opinion of 
the writers, did any of the speakers pos- 
sess voices noticeably lacking in vitality” 
(p. 646). This opinion seems not to 
have corresponded with the perceptions 
of the listeners, whose accuracy in de- 
ciding whether the speaker was tired or 
rested proved to be significantly worse 
than chance. The authors feel that “the 
existence of stereotyped tired and rested 
voices” (p. 649) accounted for the re- 
sults. 


Ignoring Content


Brody (1943), attempting to focus 
his attention away from the patients’ 
words, has called attention to subtle 
variations in voice during the course of 
psychoanalysis. He presented several 
cases in which vocal changes seemed to 
mark transitions between major emo- 
tional stages in therapy. He regards 
vocal expression as a relatively safe way 
to act out hostile feelings during analy- 
sis. Verbal content may be “ignored” 
by measuring specific nonverbal proper- 
ties of speech. Different aspects of 
speech rate and breathing rate were 
measured in a series of studies on non- 
content aspects of speech during 
psychotherapy (Goldman-Eisler, 1955, 
1956a, 1956b). Rate of respiration and
expulsion rate of syllables were corre- 
lated with a single psychiatrist's judg- 
ment of the emotional content of the 
patients’ communications. 

Experimental attention has been given 
to the role of silences and disturbances 
in speech as indicators of changing emo- 
tional states in psychotherapy (Dibner, 
1956; Mahl, 1956, 1959). Mahl (1959) 
has sometimes referred to these as “ex- 
pressive aspects of . . . speech,” but
they have chiefly been treated as dis- 
ruptions in the speech process rather 
than as part of the simultaneous non- 
verbal accompaniment to spoken words 
which is the center of focus in the pres- 
ent review. Krause (1961) has shown 
that Mahl’s measures are highly simi- 
lar to those used by Dibner (1956) as 
indicators of anxiety. Krause and 
Pilisuk (1961), using measures from 
both Mahl and Dibner, found that “in- 
trusive nonverbal sounds, mainly laughs 
and sighs” were the best speech indica- 
tors of transitory anxiety. 

Linguistics has offered some schemes 
of analysis which consider nonverbal, 
or nonlexical, aspects of speech (Pitten- 
ger & Smith, 1957; Trager, 1958). 
These aspects, “vocalization and voice
qualities together are being called para- 
language” (Trager, 1958, p. 4). Mc- 
Quown (1957) used these recently de- 
scribed categories in a detailed analysis 
of a single interview, where he charac- 
terized the affect involved; no inde- 
pendent measure of the affective aspects 
of the interview is reported. A highly 
detailed book-length analysis in these
terms has been made of the first 5 min- 
utes of a psychiatric interview (Pit- 
tenger, Hockett, & Danehy, 1960). In 
a study by Eldred and Price (1958) 
four judges listened to tape recordings 
from various parts of intensive psycho- 
therapy with a single patient and noted 
which linguistic and which emotional 
categories seemed to vary together. No 
cross-validation of the suggested rela- 
tionships was reported. Some doubt has 
been cast on the utility of such ap- 
proaches by a study using a larger num- 
ber of judges (Dittman & Wynne, 
1961). It was found that more tradi- 
tional linguistic categories, such as junc- 
ture, stress, and pitch, could be reliably 
coded but bore no relationship to the 
emotional state of the speakers. Paralin- 
guistic categories were expected to be 
more emotionally relevant, but they 
could not be reliably coded. 


Acoustical Methods 


Filtered speech. The studies noted 
here have eliminated verbal content by 
passing recorded speech through a low- 
pass filter designed to hold back those 
higher frequencies of speech upon which 
word recognition depends. It is expected,
in such studies, that many of the non- 
verbal aspects, such as stress patterns 
and intonation patterns based on funda- 
mental frequency, still remain. Little 
notice has been taken of the nonverbal 
information which is lost together with 
the verbal, although an acoustical study 
(Ochai & Fukamura, 1957) has shown
that the upper as well as the lower over-
tones of speech contribute to the per- 
sonal tone or timbre of a person’s voice. 
Controls have also been lacking on the 
effectiveness of the filtering in eliminat- 
ing content; the available material on 
frequency and intelligibility (Fletcher, 
1953; French & Steinberg, 1947; Lick- 
lider & Miller, 1951) does not deal with 
intelligibility in connected discourse. 
Fifteen speech samples were rated for 
emotional content by a group which 
heard them on normal recording, and 
separately by a group which heard them 
on recordings from which frequencies 
above 450 cycles per second had been 
removed (Soskin & Kauffman, 1961). 
The two sets of ratings showed signifi- 
cant agreement. Kauffman (1954) had 
a professional actor record two readings 
of a series of short speeches. In one 
reading the actor read with an emotional 
expression which was appropriate to the 
words of each speech, while in the other 
reading he used an expression which 
was highly incongruent with the verbal 
content. The recordings were passed 
through a low-pass filter to remove the 
semantic content. One group of listeners 
judged the second series of speeches for 
incongruity by comparing the filtered 
recordings with typescripts of the 
speeches. Separate groups rated the
typescripts alone, and the full range and 
filtered recordings. Kauffman classified 
the “meanings” of the rating categories 
into two main divisions: (a) expressive, 
“affect meanings relevant to the psy- 
chological state of the speaker," and 
(b) “manipulative . . . meanings rele- 
vant to the purposive behavior of the 
speaker." He found that both the vocal 
and verbal channels, corresponding to 
the affective and semantic channels of 
Soskin (1953), carry information about 
both the expressive and manipulative 
meanings in speech. There is, however, 
a tendency for the expressive function to
be performed by the vocal channel and
the manipulative by the verbal. Incon- 
gruence between vocal and verbal chan- 
nels was reflected in greater hetero- 
geneity of judgments, particularly in the 
judging of expressive meanings by those 
who heard only the filtered recordings. 
Heterogeneity of judgments was assumed 
to be a measure of ambiguity. There 
was, then, a consistent negative corre- 
lation between the degree of congruence 
of the vocal and verbal channels and 
the amount of ambiguity. 

Starkweather (1955a, 1956a) sampled 
recordings of the 1954 Army-McCarthy 
hearings for three excerpts each, of the 
voices of McCarthy and Welch, and fil- 
tered out content by attenuating higher 
frequencies. Twelve clinical psycholo-
gists showed high interjudge agreement 
on rating emotional content although 
they insisted that they had no confidence 
in their own ratings. The raters were 
then given a normal, unfiltered presenta- 
tion of the excerpts and asked to rate 
them again. A comparison of the cate- 
gories assigned to the filtered and unfil- 
tered recordings indicates that Welch’s 
voice was judged appropriate to the ver- 
bal content, while McCarthy's voice was 
judged to be without variation. 

Other acoustic techniques. Some pos- 
sible experimental approaches to dura- 
tion and other physical aspects of the 
speech signals which have not yet been 
directly used in voice and emotion stud- 
ies have been noted (Starkweather, 
1961). These may offer better ways of 
quantifying some of the dimensions of 
speech which change with changing emo- 
tions. Hargreaves and Starkweather 
(1961a) present a case study where 
judges were able to use certain aspects 
of speech spectrograph records to iden- 
tify changes in a patient's vocal be- 
havior which had been considered sig- 
nificant by her therapist. The validity 
of the “machine method” of identifying
emotionally significant vocal changes
still rests on the validity of the skilled 
listener who sets the criterion dimensions 
for it, but the authors feel that the 
method offers a great saving in effort 
over having a skilled listener consider 
separately every section of vocal be- 
havior in an interview. Using set aspects 
of the spectrogram also avoids the effects 
of fatigue and the learning of wrong 
cues which might mar the judgments of 
the skilled listener alone. The authors 
point out that much information is 
present in the spectrogram, and a differ- 
ent selection of dimensions might have 
served as well as theirs for finding 
correlates to emotional changes in the 
patient. It may be necessary to use 
different dimensions for different indi- 
viduals, as Krause (1961) has found 
different behavioral measures of vocal 
behavior to be important for subjects in 
identifying anxiety. 


CONCLUSION 


The abstract at the head of this re- 
view has summarized the main trends 
in the experimental studies on judging 
personal characteristics from voice. 
Twenty years ago, Sanford (1942) noted 
that common experience seems to accept 
the existence of connections between 
voice and personality, and if "the 
analytic-experimental approach . . . re- 
veals no relationship, we should be 
forced to conclude that it may be the 
fault of the approach" (p. 838). Many 
details still remain to be explored (Kra- 
mer, 1962), but the “analytic-experi- 
mental approach" has, by now, verified 
that such relationships exist. 
THE EXPERIMENTER:
A NEGLECTED STIMULUS OBJECT 1


F. J. McGUIGAN 2
Hollins College 


In single E experiments it is not possible to generalize to a population 
of Es. There are 3 cases for multi-E experiments: In Case I different Es 
do not differentially affect the results; in Case II one E obtains higher 
scores for all groups than does a second E; and in Case III E character- 
istics interact with treatment conditions. Results are potentially general- 
izable for Cases I and II, but not for Case III. Evidence indicates that 
multi-E experiments are common, but that reports of procedure and 
results for different Es are almost nonexistent. It is essential that we 
explicitly attempt to generalize to a population of Es; specify techniques 
of controlling the E variable; and accumulate knowledge in a variety of 
experimental situations about the effects of Es on their Ss. 


To say that behavior is a function of 
a fantastically large number of stimulus 
variables is to understate the immensity 
of the problem facing the psychologist. 
Clearly, the sustained laboratory dissec- 
tion of our environment has produced 
considerable information about the rela- 
tionship between behavior and a number 
of classes of stimulus variables, but just 
as clearly much more remains to be ac- 
complished. In assessing our status, it 
is well to emphasize the presence of one 
particular stimulus object of the complex 
environment in which we immerse sub- 
jects—the experimenter himself. While 
we have traditionally recognized that the 
characteristics of an experimenter may 
indeed influence behavior, it is important 
to observe that we have not seriously 
attempted to study him as an independ- 
ent variable. Rather, we have typically 
regarded the experimenter as necessary, 
but undesirable, for the conduct of an 
experiment. Accordingly, in introduc- 
tory textbooks on experimental psychol- 


1 Modification of a paper presented at the 
American Psychological Association meetings, 
1961, in a symposium entitled “The Social 
Psychology of the Psychological Experiment.” 

2 The author expresses appreciation to Sher- 
man Ross for his valuable suggestions concern- 
ing the presentation of this paper. 


ogy we provide prescriptions for con- 
trolling this extraneous variable; but 
seldom do we consider the experimenter 
variable further, and the extent to which 
we actually control it in our experimenta- 
tion can be seriously questioned. As doc- 
umentation for this statement, consider 
some findings based on an analysis of 
37 usable articles from three recent issues 
(selected at random) of the Journal of 
Experimental Psychology. These articles 
were classified according to the number 
of possible data collectors and number 
of authors. In Table 1 we can see that 
10 of the 37 articles had only one possi- 
ble data collector. It is reasonable to 
assume that at least a majority of the 
other 27 experiments employed more 
than one data collector. In no article 


TABLE 1 


NUMBER OF POSSIBLE DATA COLLECTORS IN A
SAMPLE OF ARTICLES FROM THE JOURNAL OF
EXPERIMENTAL PSYCHOLOGY
———————— 


No. of No. of No. of possible 
authors articles data collectors 
1 2 "XE 
eee ee Be RC S 
1 16 10 3) Rive 
2 17 (ee C AN ee | 
Total 37 {0 37 73, 3

was any mention made of techniques of
controlling the experimenter variable and 
in only one of the articles was the num- 
ber of data collectors actually specified. 
Furthermore, in no article was a statisti- 
cal analysis of results as a function of 
experimenters reported. It seems quite 
clear that we are deficient in the write-up 
and analysis, if not in the design of our 
experiments as far as the experimenter 
variable is concerned. The possibility is 
alarming that in multidata collector ex- 
periments adequate control is not exer- 
cised. Especially is this so for those 
psychologists who have witnessed in 
amazement the conduct of experiments 
by some of their colleagues in which one 
experimenter collects data for a while,
after which he is relieved by another ex- 
perimenter, with no plan for balancing 
the subjects in the groups over the 
experimenters. Such an experiment is 
totally indefensible. But it is, optimisti- 
cally, assumed to be relatively rare. 
Where pains have been taken to control
the experimenter variable in multiexperi-
menter experiments, it is not unreasonable
to request that results be presented as a
function of experimenters. This request
has three bases: (a) it will justify the
control procedures used, (b) it will help
indicate the extent to which the results 
are generalizable to a population of ex- 
perimenters, and (c) it will provide 
much needed information on the extent 
and nature of the experimenter's influ- 
ence on the subjects. Point a needs no 
further elaboration. But Points b and c 
can profitably be developed. 
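
The kind of plan for balancing subjects over experimenters that is called for above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper name `balanced_assignment`, the subject labels, and the shuffle-then-deal scheme are hypothetical and are not taken from any of the studies cited. The sketch assigns an equal number of subjects to every treatment-by-experimenter cell, so that no treatment group is run mostly by one experimenter.

```python
import random
from collections import Counter
from itertools import product


def balanced_assignment(subjects, treatments, experimenters, seed=0):
    """Assign subjects to (treatment, experimenter) cells in equal numbers.

    Hypothetical helper for illustration. The number of subjects must be
    a multiple of the number of cells so that every cell gets the same count.
    """
    cells = list(product(treatments, experimenters))
    if len(subjects) % len(cells) != 0:
        raise ValueError("subjects must divide evenly among the cells")
    rng = random.Random(seed)
    shuffled = list(subjects)
    rng.shuffle(shuffled)  # random order first, then deal round-robin
    return {s: cells[i % len(cells)] for i, s in enumerate(shuffled)}


# Twelve subjects, two treatments, three experimenters: two subjects per cell.
plan = balanced_assignment(
    subjects=[f"S{i}" for i in range(1, 13)],
    treatments=["A", "B"],
    experimenters=["E1", "E2", "E3"],
)
cell_counts = Counter(plan.values())
```

With such a plan recorded in advance, the results can later be tabulated by experimenter as well as by treatment, which is what the three bases listed above require.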


SAMPLING FROM A POPULATION OF 
EXPERIMENTERS 


Assume that in a given experiment it 
was possible to control the experimenter 
variable in a completely adequate fashion 
by holding that variable constant. This 
means that the numerous stimuli emanat-
ing from the experimenter-stimulus ob-
ject have assumed the same,
but unspecified, value for all the subjects
throughout the experiment. Although we
cannot specify the intensity and other values of the
experimenter-produced stimuli, we are as-
suming that they have not differentially
affected the behavior of the subjects.

Clearly such a technique of controlling
the experimenter variable is not practi- 
cal. But that is not the worst of it. For 
controlling any variable by holding it 
constant is only defensible in the long 
run if the one experiment concerned ex-
hausts the universe of investigations on
the problem posed. And never would a
universe of experiments be limited to one.
Hence, let us consider that our hypo-
thetical experiment in which the experi-
menter variable is held constant is re-
peated by another experimenter, one who
takes pains to duplicate all of the con-
ditions of our experiment that have been
specified. And there is the rub. In the
original experiment we have held the 
experimenter variable constant, but it 
simply was not possible to specify the 
intensity and other values of that com- 
plex variable. While in the replication 
of our hypothetical experiment we may 
assume that the experimenter variable
was similarly held constant, it is also safe 
to assume that it was held constant at 
different values than obtained in our 
original experiment. 

Let us take a particular, measurable
characteristic of the experimenter as an
illustration. Suppose that the experi-
menter in one experiment manifests what
we call a high degree of anxiety, whereas
the second experimenter has a low degree
of anxiety. We can well expect that the
stimuli emitted by these two experiment-
ers will be different either in nature or
in value. Will these two classes of stim-
uli differentially affect the dependent
variable measures of the subjects in the
two experiments? This question gives us
the opportunity to make sure that we are
in agreement with respect to the place of
the experimenter variable in psychologi-
cal research.

The problems arising in the sampling 
of subjects for experimentation have re-
ceived considerable attention—the un- 
dergraduate psychology major who is 
not aware of the mechanics of obtaining 
a random sample from a well-defined 
population of subjects is probably a rare 
specimen. Although the way was paved
some years ago, particularly by Bruns-
wik (e.g., 1947), the same can-
not be said with regard to other popula-
tions relevant to experimentation. Bruns-
wik emphasized the importance of 
sampling stimulus populations, but rarely 
are such populations actually systemati- 
cally sampled in psychology today— 
especially is this true of the subclass of 
stimulus variables emitted by the ex- 
perimenter who faces the subjects. On 
any given problem, we could define a 
population of experimenters, although 
admittedly not easily in an unambiguous 
fashion. In our conduct of an experiment 
on that problem, then, strictly speaking 
we should employ a design (such as a 
complete factorial design) that allows 
us to vary experimenters—we should 
randomly sample from a population of 
experimenters and replicate the experi- 
ment for each experimenter used. 

Now let us return to our question: 
does the fact that two experimenters
differ only in regard to a single charac- 
teristic affect the performance of sub- 
jects in two otherwise identical experi- 
ments? There are three general answers 
possible. 

Case I. First, the stimulus charac- 
teristic in question is totally unrelated 
to the dependent variable being meas- 
ured. In this event essentially the same 
scores would be obtained by both ex- 
perimenters. Clearly in this case, we 
need not be concerned in the slightest as 
to whether or not experimenters in our 
hypothetical population differ—their re- 


Case III. The first two possible an-
swers to our question do not greatly con-
cern us. The third, however, can be
rather important. To take an extreme


experimenters differ and the independent 
variable of the experiment. 

As an example of a Case III experi- 
ment, briefly consider an interaction 
reported by Kanfer (1958). Two experi- 
menters who had “minimal gross differ-
ences” participated in a verbal condition-
ing experiment. The subjects were 
required to say words continually and the 
verbs that they emitted were reinforced 
by flashing a light according to one of 
three reinforcement schedules. The ex- 
perimenter's task was simple—to dis- 
criminate between verbs and nonverbs, 
and flash a light. The results indicated 
a significant Method x Experimenter in- 
teraction—there was more frequent rein- 
forcement of words for one schedule than 
for the others, the frequency varying for 
the experimenters. The experimenters 
evidently differed from each other in 
their ability to perceive verbs as a func- 
tion of reinforcement schedule. The rea- 
son for this seems obscure, but the lesson 
to the investigator is again driven home 
— if our results are a function of experi- 
menter characteristics, then they are 
highly specific and cannot be generalized. 
It should be emphasized that interac- 
tions involving experimenters may not 
only be unexpected, but quite obscure. 
In general we simply have not had
enough experience with experimenter in- 
teractions to know where to look for 
them. To further emphasize the obscu- 
rity of this type of interaction, consider 
some results from a study involving four 
methods of learning and nine experiment- 
ers (McGuigan, 1960). The analysis of 
variance indicated that there was a sig- 
nificant difference among methods but 
that experimenters did not differ, and 
particularly that the methods by experi- 
menter interaction was not significant. 
According to our normal procedure, we 
would conclude that the results with 
regard to methods are not a function of
experimenters. But now let us study 
the interexperimenter variability more 
closely. We can note that there is con- 
siderable variability among experiment- 
ers for Methods P and VIW in Figure 1, 
but that there is relatively little inter- 
experimenter variability for Methods IW 

FIG. 1. Dependent variable scores for four
methods plotted as a function of experimenters
(after McGuigan, 1960).

and W. The variance for each method
was computed and it was found that they
differ significantly. Furthermore, the
variability among the experimenters is a
function of methods when methods are
ordered from P to VIW to IW and to W.
In Figure 1 we arranged the experi-
menters on the horizontal axis in a
random fashion. Lines of best fit are ap-
proximately parallel. In Figure 2, how-
ever, we have arranged the experimenters
according to intraexperimenter variabil-
ity. Now lines of best fit appear to devi-
ate rather markedly from being parallel.
Particularly note that the relative pro-
ficiency due to the various methods is a
function of the experimenters. Here we
have a single experiment replicated nine 
times. Suppose that we had conducted 
the experiment only once using, say, Ex- 
perimenter Number 9. This experi- 
menter yielded a clear set of results due 
to methods. But had we chosen Ex- 
perimenter Number 8, a different set of 
results would have obtained. And con- 
trast these results with those obtained 


recording the 
this case the solution is quite clear and 
we have long been aware of the problem 
— precise adherence to experimental pro- 
cedure as we report it in detail in our 
publications. That this principle is not 
strictly adhered to can be made mani- 
festly clear by conducting interexperi- 
menter analyses. The recent work of 
Azrin, Holz, Ulrich, and Goldiamond 
(1961) on operant conditioning of 
conversations by student experimenters 
indicated considerable variation in rein- 
forcement techniques as well as down- 
right distortion. Stories of violation of 
proper data collection procedures by 
graduate assistants are legion, if some- 
what suppressed. Analyses to determine 
differences among experimenters on de- 
pendent variable scores can serve to at 
least stimulate investigation of proce- 
dural problems in a given experiment. 
The second possible difference among 
experimenters in Figure 2 concerns what 
we might call personality characteristics 
of the experimenters—we have a possible 
ordering of experimenters along some 
personality dimension. The only ques- 


personality 
influenced a given 
Hints can thus be obtained that can lead 


the subject.
Experiments in which experimenters 
with different characteristics 
that response by saying “good.” Two
groups were used, a different experi- 
menter for each group. The two experi- 
menters differed in gender, height, 
weight, age, appearance, and personality: 

The first . . . was . . . an attractive, 
soft-spoken, reserved young lady . . . 5'4” in 
height, and 90 pounds in weight. The . . . 
second . . . was very masculine, 6'5" tall, 
220 pounds in weight, and had many of the
unrestrained personality characteristics which
might be expected of a former marine captain
—perhaps more important than their actual
age difference of about 12 years was the dif-
ference in their age appearance: the young
lady could have passed for a high school 
sophomore while the male experimenter was 
often mistaken for a faculty member (Binder 
et al., 1957, p. 309). 


The results of this experiment are 
shown in Figure 3. We can see that the 
rate of emitting hostile words increases 
with trials for both groups—saying 
"good" reinforced the response for both 
experimenters. But of particular signifi- 
cance to us now is the fact that the rates 
of learning for the two groups differed 
significantly—the slope is steeper for the 
female experimenter’s group. Clearly 
the differences between the two experi- 
menters are numerous, so it is difficult 
to specify just what experimenter charac- 
teristic or combination of characteristics 

FIG. 3. Learning curves for two groups
treated the same except for experimenters. (The
steeper slope for the subjects of the female
experimenter illustrates an interaction involv-
ing experimenters—after Binder et al., 1957.)

is responsible for this difference in learn-
ing rate of the two groups. But this 
research is a promising start. A follow- 
up of it might be aimed at testing the 
authors’ speculation as to the important
difference: that the female experimenter 
“provided a less threatening environ- 
ment, and the Ss consequently were less 
inhibited in the tendency to increase 
their frequency of usage of hostile 
words” (Binder et al., 1957, p. 313). 
An interesting experiment by Spires 
(1960) is illustrative of how character- 
istics of the subjects can interact with 
perceived characteristics of the experi- 
menter. Spires selected a group of sub- 
jects high on the Hy scale of the MMPI 
and a second high on the Pt scale. The
subjects entered the experimental situa- 
tion with one of two sets: the positive 
set was where the subject was told that 
the experimenter was a “warm, friendly 
person, and you should get along very 
well"; the negative set was where the 
subject was told that the experimenter 
may “irritate him a bit, that he's not 
very friendly, in fact kind of cold." This 
was a verbal conditioning study in which 
a class of pronouns was reinforced by 
saying “good.” An analysis of variance 
for Spires' results indicated that there
was a significant difference between posi- 
tive and negative sets for the experiment- 
ers (the positive set leading to better 
conditioning), and that the interaction 
between set for the experimenter and 
MMPI score of the subject was signifi- 
cant. This interaction is illustrated by 
the learning curves shown in Figure 4. 
There we can see that the hysterics, who 
had a positive set for their experimenter, 
condition remarkably better than the 
other three groups. While apparently 
this is the only study which shows that 
a rather well defined personality charac- 
teristic of the subjects interacts with a
perceived characteristic of the experi- 
menter, further investigation would un-
doubtedly yield additional interactions
of this nature. 


CONCLUSIONS


1. Where one data collector is used in 
an experiment, the best that can be done 
is to attempt to hold his influence on the 
subjects constant. The results, in this 
instance, cannot strictly speaking be gen- 
eralized to the relevant population of 
experimenters—if they are, the generali- 
zation must be extended exceedingly 
cautiously, at best. While it is not possi- 
ble to adequately specify the experi- 
menter’s characteristics in the report of 
the experiment, it should be recognized 
that this inability does not remove the 
problem—it persists and for a worker 
in a particular area the question of gen- 
eralization from a single experimenter 
experiment can assume nightmarish 
proportions. 

2. Where more than one data collector 
has been used (a) techniques of control 
should be specified, (b) the data should 
be analyzed and reported as a function of 
experimenters, and (c) interactions be- 
tween experimenters and treatments 
should be tested. Should the results 
indicate that the experiment is an in- 
stance of Cases I or II, the results are 
generalizable to a population of experi- 


menters to the extent to which such a 
population has been sampled. Granted 
that completely satisfactory sampling can 
seldom occur, at least some sampling is 
better than none. And it is beneficial to 
know and to be able to state that, within 
those limitations, the results appear to 
be instances of Cases I or II. If the ex- 
periment turns out to be an instance of 
Case III, the extent to which the results 
can be generalized is sharply limited. 
One can only say, for instance, that 
Method A will be superior to Method B 
when experimenters similar to Experi- 
menter Number 1 are used, but that the 
reverse is the case when experimenters 
similar to Experimenter Number 2 are 
used. This knowledge is of course valu- 
able, but only in a negative sense since
we do not know what the characteristics 
of the two experimenters are—to under- 
state the matter, the interaction tells us 
to proceed with considerable caution. 

3. It is important to contribute to our 
general fund of knowledge of the experi- 
menter variable, for it is indeed small at 
this time. That this request to collect 
relevant data will not excessively burden 
us is indicated by the frequency with 
which more than one experimenter al- 
ready participates in an experiment (see 
Table 1; further, note that in a sample 
of 722 articles from journals concerned 
primarily with experiments, 48% had 
two or more authors [Woods, 1961]). 
Quite clearly we already have enough 
information to safely assert that interac- 
tions between experimenters and treat- 
ments do occur. But there is a paucity 
of data about their frequency of occur- 
rence as a function of type of experi- 
mental situation. By designing more 
experiments to test for differences be- 
tween experimenters and for interactions 
involving experimenters we may eventu- 
ally be able to handle the problems in- 
dicated in Number 1 above and by in- 
stances of Case III. 
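
The distinction among Cases I, II, and III in the conclusions above can be made concrete with a small descriptive sketch in Python. Everything specific here is an assumption for illustration: the function name `classify_case`, the tolerance `tol`, and the cell means are hypothetical, and a fixed tolerance stands in for the significance tests that an actual analysis of variance would supply. The sketch splits each cell mean into a grand mean, an experimenter effect, a method effect, and an interaction residual, and then asks which of these are non-negligible.

```python
def classify_case(cell_means, tol=0.5):
    """Classify a table of cell means as Case I, II, or III.

    cell_means[experimenter][method] is the mean score for that cell.
    The tolerance is an illustrative stand-in for a significance test.
    """
    experimenters = list(cell_means)
    methods = list(cell_means[experimenters[0]])
    n_e, n_m = len(experimenters), len(methods)

    grand = sum(cell_means[e][m] for e in experimenters for m in methods) / (n_e * n_m)
    # Main effects: deviation of each marginal mean from the grand mean.
    e_eff = {e: sum(cell_means[e][m] for m in methods) / n_m - grand for e in experimenters}
    m_eff = {m: sum(cell_means[e][m] for e in experimenters) / n_e - grand for m in methods}
    # Interaction: what is left after removing both main effects.
    resid = [
        cell_means[e][m] - grand - e_eff[e] - m_eff[m]
        for e in experimenters
        for m in methods
    ]

    if max(abs(r) for r in resid) > tol:
        return "Case III"  # experimenters interact with treatments
    if max(abs(v) for v in e_eff.values()) > tol:
        return "Case II"   # one experimenter is uniformly higher
    return "Case I"        # experimenters do not affect the results


# Hypothetical cell means for two experimenters and two methods.
case_one = {"E1": {"A": 10, "B": 14}, "E2": {"A": 10, "B": 14}}
case_two = {"E1": {"A": 10, "B": 14}, "E2": {"A": 13, "B": 17}}
case_three = {"E1": {"A": 10, "B": 14}, "E2": {"A": 14, "B": 10}}
```

In the third table Method B is superior for Experimenter 1 and Method A for Experimenter 2, which is the reversal described in Conclusion 2. The marginal means for methods are equal there, so an analysis that ignored experimenters would report no difference between methods at all.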

As with all other variables with which
we are concerned, determining the effects 
of the experimenter variable is a long, 
energy consuming project. But we must 
face up to our task. Recognizing the 
enormity of this project, one can well 
ask whether or not there is a more effi- 
cient approach. The only other possibil- 
ity that occurs at present is to eliminate 
the experimenter from the experiment. 
For some problems that we study, this 
would be relatively easy, but it is hard 
to visualize how this could be accom- 
plished in other experiments. For in- 
stance, a number of completely auto- 
mated devices have been developed and 
successfully used in running rats—the 
subjects are never exposed to a human 
experimenter. Automation has also en- 
tered psychology at the human level, 
but in neither case is automation very 
general, and certainly it is not stand-
ardized. In a number of experiments

it seems reasonable to have the sub- 
ject enter the experimental room and 
be directed completely by taped instruc- 
tions, thus removing all visual cues, 
olfactory stimuli, etc., emitted by the 
experimenter. If eventually the human 


experimenter is replaced by devices which 
automatically run the subject through 
his routine, we must be careful not to 
select values of stimuli emanating from 
these devices that themselves interact 
with the treatments that we are studying. 
Evidence for involvement of cognitive factors makes it seem clear
that the number, the frequency locations, and the widths of the critical 
bands which are operative in a given auditory task reflect to a sub-
stantial extent the strategy of listening that is adopted by O for that 
particular task. Central modulation of sensory information is extensive 
enough to make unlikely the discovery, through psychophysical meth- 
ods, of a unitary peripheral process that remains stable despite changes 
in O's task, his information, and his aims. The value of a psychophysi- 
cal approach to peripheral sensory mechanisms depends upon the ability 
to specify, and then on the ability to isolate, the central contribution to 


O's response. 


Harvey Fletcher (1940) reported an 
experiment that established the psycho- 
physical study of auditory frequency 
analysis. His paper introduced the con- 
cept of the “critical band,” the band of 
frequencies over which the listener in- 
tegrates acoustic power, and his experi- 
ment provided a basis for quantifying
the concept. Fletcher's experiment
showed that only noise components in a 
narrow region about a pure tone are ef- 
fective in masking the tone. He found, 
in particular, that the amplitude re- 
quired for a tone to be just detectable 
remained constant, despite variations in 
the width of a band of noise surrounding
the tone, if these bands were wider than 
some critical value. For bands of noise 
that were narrower than the critical 
value, the amplitude of the just detecta- 
ble tone decreased as the width of the 
band of noise decreased. He also ob- 
served that the width of the critical 
band increased with its center fre- 
quency; the width of the critical band 
appeared to be approximately 7% of its 
center frequency over a large range 
(Fletcher, 1940). 

Fletcher was concerned with periph- 
eral aspects of the process of frequency 
analysis, with the frequency selectivity 
that is accomplished in the ear. Illustra- 
tive of this interest is his suggestion that 
the critical bandwidths at different cen- 
ter frequencies represent equal distances 
on the basilar membrane. Other investi- 
gations have followed in this tradition, 
including those of Schafer, Gales, Shew- 
maker, and Thompson (1950); Hamil- 
ton (1957); Zwicker, Flottorp, and 
Stevens (1957); Greenwood (1961a) ; 
and Swets, Green, and Tanner (1962).
Four of these studies, excepting that of
Zwicker et al., confirmed Fletcher's
basic result in experiments that were
similar to his; two of them (Schafer
et al., 1950; Swets et al., 1962)
showed close quantitative agreement 
with Fletcher's estimate of the critical 
bandwidth. Zwicker et al. showed the 
existence of a critical band in several 
different kinds of psychophysical ex- 
periments: experiments on masking of 
noise by tones, on loudness summation, 
on detection of multiple tones, and on 
phase sensitivity. Although these dif- 
ferent experiments showed a dependence 
of critical bandwidth on frequency simi- 
lar to that observed by Fletcher, they 
have led, along with the replications of 
Fletcher's experiment by Hamilton and 
by Greenwood, to substantially greater 
estimates of the width of the critical 
band. The estimates of critical band- 
width obtained in these experiments are, 
on the average, 15-20% of the center 
frequency of the band. Zwicker et al. 
and, especially, Greenwood have pur- 
sued the suggestion that the critical
bandwidth may be simply related to the 
anatomy of the ear. Greenwood (1961b) 
accepts as valid the larger estimates of 
critical bandwidth, and concludes that 
each critical band corresponds to one 
millimeter along the basilar membrane. 
Scharf (1961) has recently reviewed 
most of the studies having a focus on 
peripheral frequency analysis. 
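
The size of the disagreement between the two sets of estimates is easiest to see with a worked example. The Python sketch below simply applies the proportions quoted above (about 7% of center frequency for Fletcher, 15-20% on the average for the later studies). The names `bandwidth_estimates`, `FLETCHER_FRACTION`, and `LATER_FRACTIONS` are ours, and treating the bandwidth as a fixed fraction of center frequency is an approximation that the studies report only over a limited range.

```python
FLETCHER_FRACTION = 0.07        # Fletcher (1940): about 7% of center frequency
LATER_FRACTIONS = (0.15, 0.20)  # later estimates: 15-20% on the average


def bandwidth_estimates(center_cps):
    """Approximate critical bandwidths (cycles per second) at one center frequency.

    Illustrative only: assumes bandwidth is a fixed fraction of center frequency.
    """
    if center_cps <= 0:
        raise ValueError("center frequency must be positive")
    low, high = LATER_FRACTIONS
    return {
        "fletcher": FLETCHER_FRACTION * center_cps,
        "later_low": low * center_cps,
        "later_high": high * center_cps,
    }


# At 1000 cps: roughly 70 cps versus roughly 150 to 200 cps.
estimates = bandwidth_estimates(1000.0)
ratio = estimates["later_low"] / estimates["fletcher"]  # a little over 2
```

Even the smaller of the later estimates is thus more than twice Fletcher's value at the same center frequency.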

By the mid-1950s, there became ap- 
parent several reasons for considering 
the role of central, or cognitive, factors 
in the process of auditory frequency 
analysis. In general, it seemed fruitful 
to view the auditory analysis system as 
including more than fixed sensory ele- 
ments, and to examine the way in which 
adaptive portions of this larger system 
adjust to different auditory tasks. The 
approach to the analysis process on this 
different, but perhaps not entirely in-
dependent, level of inquiry originated 
with the work of Tanner, Swets, and 
Green (1956). It is exemplified in the 
studies of Tanner (1956) ; Green (1958, 
1960, 1961); Veniar (1958a, 1958b); 
Swets, Shipley, McKey, and Green 
(1959); Green, McKey, and Licklider 
(1959); Creelman (1960); and Swets 
and Sewall (1961). In the following we 
review this research effort. After con- 
sidering its motivation, we discuss the 
two theoretical models that have been 
proposed to incorporate the action of 
central factors, and the experiments con- 
ducted to test predictions from these 
models. 


MOTIVATION FOR AN EXAMINATION OF 
CENTRAL FACTORS 


Not all of the reasons for assuming 
and investigating adaptive aspects in 
auditory frequency analysis are clear, 
for, to some extent, they derive from 
the spirit of the times. We shall, how- 
ever, attempt a brief characterization 
of this spirit, and we shall single out the 
more important specifics. 

Developments in Neurophysiological 
and in Psychological Theory. It is now 
a commonplace that much of psycho- 
logical theory rests implicitly on pre- 
vailing neurophysiological concepts. Be- 
fore, and for some time after, Fletcher’s 
work the conceptual nervous system was 
a passive system, one in which only 
energy changes at the receptor deter- 
mined what information was conveyed 
to the brain. In the 1950s neurophysio- 
logical theory adjusted to the rapidly 
accumulating evidence that sensory in- 
formation is fed into a central nervous 
system that is continuously active and 
organized, and that extensively modu- 
lates the sensory flow, in part through 
efferent control of receptors (Hebb,
1949, 1955; Lashley, 1951; Sperry,
1952). Experimental results in audition
which are illustrative of this newer
conception are the finding that a cat's
auditory nerve activity in response to a
clicking sound can be inhibited by im- 
pulses aroused in the brain which pass 
out to the cochlea (Galambos, 1956), 
and the finding that the electrical re- 
sponse of the cat's auditory cortex to a 
click vanishes after a period of habitua- 
tion, reappears if the click is paired with 
a noxious stimulus, and disappears again 
when the click is no longer followed by 
the noxious stimulus (Galambos, Sheatz, 
& Vernier, 1956). 

Quite independent of the impetus 
from neurophysiology, several develop- 
ments within psychology have con- 
tributed to the recent concern for cen- 
tral factors in sensory processes. One of 
particular relevance, noted by Broad- 
bent (1962), is the research on the hu- 
man operator's ability to process infor- 
mation which has helped to make re- 
spectable again the concept of attention. 
Broadbent's paper describes recent ex- 
periments on listening to two voice 
messages simultaneously. The results of 
those experiments are in good agreement 
with the experimental results reviewed 
in the following. 

Specification of Central Factors in the 
Detection Process. Another circumstance 
that suggests that central factors play a 
part in the process of frequency selec- 
tivity is the finding that they play a very 
large part in even the simplest sensory 
detection task. In general, the detection 
of a fixed signal by a practiced observer 
depends critically on the observer's in- 
formation about the probability of signal 
occurrence, on his information about the 
values and costs associated with the 
various response outcomes, and on his 
detection goal (Swets, 1961; Tanner & 
Swets, 1954). Furthermore, a number 
of studies have shown that the observer 
performs better when he has more
information about the physical
characteristics of the signal, and that this
information affects the receptive system
rather than the response system. One 
example comes from an experiment in 
which the observer was required to de- 
tect a tonal signal that could, at random, 
have either of two specified frequencies. 
Telling the observer the particular fre- 
quency occurring on a given trial, after 
the signal had occurred, had little, if 
any, effect; particular frequency infor- 
mation given before the trial improved 
the detection significantly (Swets & 
Sewall, 1961). In another experiment, 
relatively strong and weak tonal signals 
of constant frequency were presented at 
random. A higher proportion of correct 
responses was observed on the trials 
following a correct response to a strong 
signal than on the trials following a 
correct response to a weak signal. This 
result suggests that strong signals pro- 
vide better cues to the nature of the 
signal than do weak signals, and that the 
observer's perceptual set is correspond- 
ingly better after a strong signal 
(Shipley, 1959). Other studies have 
demonstrated that the observer's effi- 
ciency in detecting a given signal is 
markedly improved when cues to the 
nature of the signal, or aids to the
observer's memory, are provided along
with the signal. Apparently the observer 
has a residual uncertainty about the 
signal's frequency even when a single 
frequency is used throughout a long 
series of trials. His performance is more 
efficient when he is required to detect 
an increment in a continuous quite 
audible tone than it is when he is re- 
quired to detect a tone burst in the ab- 
sence of an immediate cue to frequency. 
Similarly, if the audible tone is pulsed, 
sometimes with the increment that con- 
stitutes the signal and sometimes with- 
out, this precise cue to the starting time 
and duration of the signal improves the 
observer's efficiency. Again, an
immediate cue to the signal's amplitude is
beneficial. When all of these aids to the 
observer's memory of the salient char- 
acteristics of the signal are present, the 
human observer's performance falls only 
about 3 db. below that calculated for a 
mathematically ideal detector with com- 
plete knowledge of the signal; without 
these aids, the observer falls short of the 
calculated ideal by 12-15 db. (Swets, 
1961). 

Diverse Estimates of Critical Band- 
width. More specific to the problem at 
hand, an investigation of central factors 
in frequency selectivity holds forth a 
promise of accounting for the large vari- 
ability exhibited by various estimates of 
the critical bandwidth. The lack of 
agreement among the estimates, as noted 
above, has presented a problem. There 
seems to be no good reason for discard- 
ing any of the estimates as unreliable, 
particularly because each of them has 
company. The differences, however, 
have not been reconciled. It is possible, 
of course, for the estimates to vary while 
the critical bandwidth underlying them 
is stable. A fixed critical bandwidth 
might manifest itself differently as dif- 
ferent sensory tasks are posed for the 
observer; if, for example, the edges of 
the critical band have a gradual, rather 
than a steep, slope, then the effect of 
these edges on the experimental results 
would depend upon the frequency dis- 
tribution and the intensity of the stimuli 
used in the experiment. Furthermore, 
as Hamilton (1957) has pointed out, 
small differences in experimental proce- 
dure and filter characteristics can lead 
to large differences in estimated critical- 
band values even in experiments of the 
same type. Still another source of vari- 
ability in the estimates is the particular 
shape of the critical band that is as- 
sumed in the analysis; Swets et al. 
(1962) have shown that a single set of 
data can lead to various estimates of the 
width of the critical band which
correspond to various assumptions about its
shape. Thus the variability observed 
might be attributable to differences in 
experimental procedure. It is also possi- 
ble, on the other hand, that some of the 
variability in the estimates of the critical 
bandwidth results from variation in the 
analysis system. Different estimates 
may accurately reflect what are, in ef- 
fect, different critical bandwidths, which 
have been adjusted to best suit the re- 
quirements of different auditory tasks. 

Obvious Need to Treat Certain Cen- 
tral Factors. Beyond the Zeitgeist, the 
collateral evidence for central factors in 
auditory frequency selectivity, and the 
possibility of reconciling various esti- 
mates of the critical bandwidth, is the 
clear evidence for the operation of cer- 
tain central factors in the selection 
process. If the observer suppresses 
masking noise outside the critical band, 
then he must suppress signals outside 
the band as well. If signal and noise 
were separable, noise would not be noise. 
Moreover, if the observer’s filter passes 
only the noise components in a band 
about 1,000 cps when he is expecting a 
1,000-cps signal, and if only noise com- 
ponents about 2,000 cps are effective 
when he is expecting a 2,000-cps signal, 
then the analysis system is somehow 
different under these two circumstances. 
At least one parameter of the system is 
adjusted in accordance with the ob- 
server’s expectations. The observer ap- 
parently controls in an intelligent way— 
that is, in accordance with information 
available to him, and his goals—the 
focus of his attention. Several empirical 
questions that are relevant to frequency 
selectivity follow immediately from this 
statement. How rapidly can the
observer adjust the center frequency of the
critical band? With what precision? 
Can he adjust as well other parameters 
of the analysis system? Can he, for
example, control the width of the critical
band? Can he control the number of 
different critical bands that are opera- 
tive at one time? The answers to ques- 
tions like these were the aims of the 
theoretical and experimental work re- 
viewed in the remainder of this paper. 


THEORY AND EXPERIMENT 
Single-Band Model of the Process 


In a first attempt to deal with central 
factors in the process of frequency selec- 
tivity, Tanner et al. (1956) developed 
what has been termed the single-band 
model of the process. In this model, as 
in Fletcher’s conception, the observer is 
viewed at any given time as integrating 
acoustic power over only a limited range 
of frequencies. The single-band model 
goes beyond the original hypothesis of 
the critical band in making explicit the 
fact that the observer controls the mo- 
mentary frequency location of this range. 
It is further asserted in the model, in 
order to be specific, that a change in the 
frequency location of the band of sen- 
sitivity is effected by sweeping the band 
through the intervening frequencies, and 
that the time required to make a change 
increases with the extent of the change. 
It may be noted that this model bears a 
strong resemblance to the familiar 
searchlight analogy of the process of 
attention. 

We review, in the following, four dif- 
ferent types of experiments that were 
conducted to test various predictions de- 
rived from this model. These experi- 
ments required the observer to detect a 
signal at an unexpected frequency, to 
detect a signal that was equally likely 
to be either of two specified frequencies, 
to recognize which of two specified fre- 
quencies was presented, and to detect 
signals composed of two frequencies. 

Detection of a Signal at an Unexpected
Frequency. In the first experiment
Tanner et al. (1956) gave unpracticed
observers several training sessions
that employed only a 1,000-cps signal,
of .15-second duration, in a continuous
background of white noise. The
four-alternative, forced-choice method of
response was used; that is, each trial
consisted of four time intervals, exactly one
of which contained the signal, and the 
observer indicated which interval he 
believed was most likely to have con- 
tained the signal. The proportion of 
correct responses, denoted P(c), was 
taken as the measure of performance. 

After the several practice sessions, un- 
known to the observers, the signal fre- 
quency was changed to 1,300 cps, and 
presented with the energy that had 
yielded P(c)=.65 for the 1,000-cps 
signal. For the 1,300-cps signal, P(c) 
was very nearly .25, the value represent- 
ing chance success. In later sessions in 
which the observers expected the signal 
to be 1,300 cps, and after they had 
heard this frequency without noise, the 
P(c) at the same signal energy was ap- 
proximately .65. The essentials of this 
experiment were repeated by Karoly 
and Isaacson with similar results: un- 
expected signals of 500 cps and 
1,500 cps, presented infrequently on 
randomly selected trials, led to signifi- 
cantly lower values of P(c) than did the 
expected signal of 1,000 cps.²

Detection of One of Two Specified
Frequencies. In another experiment
(Tanner et al., 1956) the observer knew
that the signal would be either of two
specified frequencies, and that the two
were equally likely to occur on a given
trial. Again, the four-alternative,
forced-choice method of response was used.
The observers chose the interval that
they believed contained the signal; they
were not asked to identify the frequency
that had been presented. The effects of
the uncertainty were assessed by
comparing the P(c) obtained from groups
of trials in which either frequency might
occur, denoted $P(c)_{1 \lor 2}$, with the
average value of P(c) obtained when the
observer knew that the signal would have
just one of the two frequencies
throughout a group of trials, denoted
$P(c)_{1,2}$. The signal energies were
adjusted so that $P(c)_1$ and $P(c)_2$ were
very nearly equal.

² A. J. Karoly and R. L. Isaacson, personal
communication, 1956.

According to the single-band model,
$P(c)_{1 \lor 2}$ will decrease as the separation
between the two frequencies increases,
that is, as it becomes less likely that the
band of sensitivity is centered at the
signal frequency that is presented.
$P(c)_{1 \lor 2}$ will reach a minimum when the
separation between the two frequencies
is sufficiently large that the observer is
unable to shift the band of sensitivity
from one to the other during the time of
the signal. This minimum is specified
under the assumption that the observer
listens for the frequency that is
presented on one half of the trials, and that,
when he listens to the frequency
presented, his probability of being correct
is equal to that when the frequency is
known; on the other one half of the
trials, when the observer is listening for
the wrong frequency, chance determines
the probability of a correct detection.
Thus the minimum

$$P(c)_{1 \lor 2} = \tfrac{1}{2}\,P(c)_{1,2} + \tfrac{1}{2}\,(.25).$$
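
As an illustration only, the minimum specified above can be computed directly. The following Python sketch is not part of the original report: the function name `single_band_minimum` and the constant `CHANCE_4AFC` are hypothetical, and the input values of P(c) are arbitrary examples rather than data from Table 1. The only assumptions are the ones stated in the text, namely that the observer listens for the presented frequency on one half of the trials and is at chance (.25 with four intervals) on the other half.

```python
# Hypothetical helper illustrating the single-band minimum described in the text.
CHANCE_4AFC = 0.25  # chance success with four equally likely observation intervals


def single_band_minimum(p_known: float, chance: float = CHANCE_4AFC) -> float:
    """Return the minimum P(c) under two-frequency uncertainty.

    p_known is P(c) when the frequency is known in advance. On half of the
    trials the observer is assumed to listen at the right frequency (p_known);
    on the other half only chance determines a correct response.
    """
    if not 0.0 <= p_known <= 1.0:
        raise ValueError("p_known must be a probability")
    return 0.5 * p_known + 0.5 * chance


if __name__ == "__main__":
    # Arbitrary example values, not taken from the experiments reviewed.
    for p in (0.65, 0.75, 0.90):
        print(f"P(c) known = {p:.2f} -> minimum under uncertainty = "
              f"{single_band_minimum(p):.3f}")
```

With P(c) of .65 for a known frequency, for example, the sketch gives a minimum of .45 under uncertainty.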

We shall reproduce here the results
of this experiment since the publication
in which they appear is not generally
available. They are shown in Table 1.
The left-hand column, labeled Δf, shows
the various frequency separations used
in the experiment. The second column
shows the frequencies used to produce
each value of Δf. Each entry in the
table under $P(c)_{1,2}$ is based on 700
trials, 350 at each frequency. Each
entry under $P(c)_{1 \lor 2}$ is based on 250
trials. The column headed Minimum
$P(c)_{1 \lor 2}$ shows the extreme prediction of
the single-band model. The column
labeled Decrement shows the difference
between $P(c)_{1 \lor 2}$ and $P(c)_{1,2}$. The
right-hand column shows the duration of the
signal.

In three of the four cases represented
in Table 1, $P(c)_{1 \lor 2}$ approaches very
nearly, at the larger values of Δf, the
minimum specified by the single-band
model. In the fourth case, Observer 1
at .3 second, there is the possibility that
the experiment was not carried far
enough, since the decrement from
uncertainty has not reached a demonstrated
maximum. There is also the suggestion
in the data that the signal duration
influences the size of the decrement
in a direction that is consistent with the
model, but, since only one observer
participated at both signal durations, this
conclusion remains tentative.

Recognition of Two Specified
Frequencies. Tanner (1956) applied the
single-band model to frequency
recognition. In the recognition experiment, a
signal having either of two specified
frequencies occurred in the single
observation interval of each trial, and the
subject indicated which frequency he
thought had occurred. Along with the
presentation of a rationale for the fact
that P(c) increases as the separation
between the frequencies increases from a
very small to a larger value, Tanner
made explicit the prediction from the
single-band model that, at some value of
frequency separation, P(c) would begin
to decrease and would reach a specifiable
minimum. The recognition would suffer
at large values of Δf because the
observer could not, presumably, listen for
both frequencies during the time of the
signal. The data obtained in the
experiment on the detection of one of two
frequencies suggested that, for .1-second
signals in the vicinity of 1,000 cps,
P(c) would begin to decrease at a Δf of
approximately 100 cps and reach a
minimum at Δf=300 cps.

TABLE 1
DETECTION OF ONE OF TWO SPECIFIED FREQUENCIES COMPARED WITH THE PREDICTION OF THE SINGLE-BAND MODEL
(Entries for Observers 1, 2, and 3; the body of the table is not reproduced here.)

The results of the recognition experi- 
ment were sufficiently variable that no 
very clear evidence for or against the 
prediction of the single-band model 
emerged. In general, the results were 
consistent with the prediction. For two 
short signal durations, .05 second and 
.1 second, P(c) at frequency separations 
of 300-600 cps approached the
minimum specified by the model. For
durations of .5 second and 1.0 second, no
decrease in P(c) was evident as Δf
increased, a result that is consistent with
a capability of observing both
frequencies, given sufficient time.

Detection of a Signal Composed of
Two Frequencies. In an instance of the
fourth type of experiment relevant to
the single-band model, Marill (1956)
presented two frequencies together. The
observer stated in which of the two time
intervals on each trial he thought the
compound signal occurred. Marill found
that a compound signal with Δf of
40 cps led to a significantly higher P(c)
than either of the two frequencies taken
singly. In fact, this value of Δf
produced perfect power summation. He
found, also, that a compound signal with
Δf=600 cps showed no summation at
all; the pair of frequencies was no more
detectable than the more detectable
member of the pair. This result is in
agreement with the single-band model
and with the results of the other three
experiments described previously.


Multiband Model of the Process


A different attempt to deal with cen- 
tral factors in the analysis process is 
represented in the multiband model pro- 
posed by Green (1958). This model was 
developed specifically for the type of 
detection task studied by Marill, the 
detection of a signal composed of two 
frequencies. According to this model, 
the observer is capable of listening to
any number of frequency bands at the 
same time. He selects the number and 
frequency locations of the bands to 
which he listens, and bases his decision 
on the linear combination of the outputs 
of the bands he has selected. The model 
predicts a degree of power summation, 
something less than perfect, even for 
large separations between the frequen- 
cies. 
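
The contrast among the summation rules at issue can be made concrete, though only under assumptions that go beyond the text. The Python sketch below assumes (a) that detectability is expressed by an index d' that is proportional to signal power within a band, so that perfect power summation corresponds to adding the component indices, and (b) that an optimal linear combination of independent bands yields the root sum of squares of the component indices. Neither assumption is stated in this review; the function names and the example values are hypothetical and serve only to show why a rule of partial summation falls between perfect summation and no summation at all.

```python
import math
from typing import Sequence


def perfect_summation(d_primes: Sequence[float]) -> float:
    """All components fall in one band: signal powers (and, by assumption, d') add."""
    return sum(d_primes)


def multiband_summation(d_primes: Sequence[float]) -> float:
    """Assumed rule for independent bands combined linearly: root sum of squares."""
    return math.sqrt(sum(d * d for d in d_primes))


def no_summation(d_primes: Sequence[float]) -> float:
    """Only one band is observed: the compound is as detectable as its best component."""
    return max(d_primes)


if __name__ == "__main__":
    # Hypothetical compounds of equally detectable components (d' = 1 each).
    for n in (2, 16):
        components = [1.0] * n
        print(f"{n:2d} components: perfect = {perfect_summation(components):.2f}, "
              f"multiband = {multiband_summation(components):.2f}, "
              f"none = {no_summation(components):.2f}")
```

Under these assumptions the three rules differ modestly for two components but diverge widely as the number of components grows, which is why compounds of many frequencies offer a sharper test than pairs.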


Experimental Comparisons of the Single- 
Band and Multiband Models 


Detection of a Signal Composed of
Two Frequencies. The multiband model
is inconsistent with the data reported
by Marill that show no summation for
large values of Δf. However, Green
(1958) reports data that are consistent 
with the model. He presented com- 
pound signals consisting of all possible 
pairs of 500 cps, 1,000 cps, 1,832 cps, 
and 2,000 cps, and found very nearly 
the predicted amount of summation in 
every case. 

The conflict between the results ob- 
tained by Marill and those obtained by 
Green has not been resolved. There are 
two reasons, however, for placing more 
confidence in Green’s results. One is 
that they are in agreement with the 
earlier results of Schafer and Gales 
(1949) which also showed a degree of 
summation at large values of Δf. The
second reason is that they are in agree- 
ment with the results obtained by Green 
et al. (1959) in an extension of the 
study of compound signals to signals 
composed of 16 frequencies. The fre- 
quencies used were all multiples of 
250 cps. The use of 16 frequencies 
makes the difference between the summa- 
tion and the no-summation predictions 
very large. The data from this experi- 
ment match very closely the predictions 
of the multiband model. 

Detection of One of Two Specified 
Frequencies. The two models can also 
be compared with data on the detection 
of one of two specified frequencies.
Derivation of the quantitative predictions
from the multiband model for this type 
of experiment would take us too far 
afield, but it will be intuitively clear that 
the multiband model predicts that a de- 
crement in P(c) will result from fre- 
quency uncertainty. According to the 
model, the observer who listens to bands 
at both of the possible frequencies will 
be listening to more noise power, but 
not more signal power, than the observer 
who listens to a single band.
Furthermore, the model leads to the prediction
that $P(c)_{1 \lor 2}$ will decrease as the
frequency separation increases, until, in
terms of the model, the point of no
overlap of the two bands is reached.
The multiband model predicts values of
$P(c)_{1 \lor 2}$ for extreme values of Δf that
are somewhat greater than those
predicted by the single-band model.

It may be recalled that the results of
the detection of one of two frequencies
given in Table 1 show three cases in
close agreement with the single-band
prediction. We can now observe that
the fourth case, Observer 1 at a signal
duration of .3 second, agrees more nearly
with the prediction of the multiband
model. (The $P(c)_{1 \lor 2}$ predicted from the
multiband model for this case, Observer
1 at .3 second, is .72. The multiband
prediction for Observer 1 at .1 second
is .70; for Observer 2, .76; and for
Observer 3, .58.) We note again,
however, the possibility that the observed
$P(c)_{1 \lor 2}$ that agrees more nearly with
the multiband prediction would have
decreased further had the experiment
been carried to larger values of Δf.

After the arrival of the multiband 
model on the scene to compete with the 
single-band model, other experiments 
were conducted to compare the predic- 
tions of these models for the detection 
of one of two specified frequencies. 
Veniar (1958a, 1958b) found that two 
of her observers matched the multiband 
prediction while the decrement exhibited 
by the other two observers was even less
than that predicted under the multiband
assumption. Unfortunately, this
experiment was not carried to large enough
values of Δf to establish that the
decrement had reached a maximum. Swets
et al. (1959) found that two observers 
matched the single-band prediction and 
a third observer matched the multiband 
prediction. Creelman (1960) found 
that, when P(c) for the two frequencies 
taken individually was approximately 
.90, all four of his observers yielded data
in close agreement with the multiband 
prediction; when P(c) for the individ- 
ual frequencies was approximately .75, 
data from three of the observers agreed 
with the multiband prediction and the 
data from the fourth matched the single- 
band prediction. The same picture 
emerged when the experiment was re- 
peated at a higher level of background 
noise. Swets and Sewall (1961) found 
that the data from their three observers 
matched the single-band prediction
when the P(c) for the single frequencies 
was approximately .75, and fell in be- 
tween the two predictions at greater 
levels of signal strength. Swets (un- 
published), using signals that led to 
P(c) of approximately .90 when taken 
individually, found that both of his ob- 
servers yielded results in good agree- 
ment with the predictions of the multi- 
band model. 

A different kind of evidence that is 
relevant to the attempt to distinguish 
between the two models is provided by 
an additional analysis of the data of one 
of the studies cited. Swets et al. (1959) 
made a contingency analysis of their 
data to determine P(c) on a given trial 
for each of the four conditions that 
could have held on the previous trial: 
(a) same frequency, correct response; 
(b) same frequency, incorrect response; 
(c) different frequency, correct re- 
sponse; and (d) different frequency, in- 
correct response. It was found that if 
the same frequency were presented on 
two successive trials, the observer was
more likely to be correct on the second 
if he had been correct on the first; the 
average difference in P(c) between Con- 
ditions a and b was .06. If different fre- 
quencies were presented on two succes- 
sive trials, the observer was more likely 
to be correct on the second if he had 
been incorrect on the first; the average 
difference in P(c) between Conditions 
c and d was .12. This analysis provides 
support for the single-band model in its 
indication that the observer is differ- 
entially sensitive to the two frequencies 
at a given point in time, in particular, 
in the indication that the observer perse- 
veres to some extent in listening to a 
given frequency band. 
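
The form of such a contingency analysis can be sketched briefly. The Python fragment below is an illustration only and does not reproduce the procedure or the data of Swets et al. (1959): the function name `contingency_pc`, the representation of a trial as a (frequency, correct) pair, and the short sequence of trials are all hypothetical. It tabulates P(c) on each trial according to whether the preceding trial presented the same or a different frequency and whether the preceding response was correct.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Trial = Tuple[int, bool]  # (frequency label, response correct?) -- hypothetical format


def contingency_pc(trials: List[Trial]) -> Dict[Tuple[str, str], float]:
    """Return P(c) on the current trial for each condition of the previous trial.

    Keys are ("same" | "different", "correct" | "incorrect"), describing the
    previous trial's frequency relative to the current one and the previous
    response. Conditions that never occur are omitted from the result.
    """
    counts = defaultdict(lambda: [0, 0])  # condition -> [number correct, number of trials]
    for (prev_f, prev_ok), (cur_f, cur_ok) in zip(trials, trials[1:]):
        key = ("same" if cur_f == prev_f else "different",
               "correct" if prev_ok else "incorrect")
        counts[key][0] += int(cur_ok)
        counts[key][1] += 1
    return {key: n_correct / n for key, (n_correct, n) in counts.items()}


if __name__ == "__main__":
    # Made-up sequence of trials, for illustration only.
    example = [(1, True), (1, True), (2, False), (2, True), (1, True)]
    for condition, pc in sorted(contingency_pc(example).items()):
        print(condition, f"P(c) = {pc:.2f}")
```

Differences between the "correct" and "incorrect" rows within the "same" and "different" conditions correspond to the comparisons between Conditions a and b and between Conditions c and d described above.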


Present Status of the Single-Band and 
Multiband Models 


Summary of Results Presented Above. 
Both the single-band and multiband 
models are consistent with the studies 
of detection of an unexpected frequency. 
Concerning the detection of one of two 
specified frequencies, approximately six 
studies favor one model while approxi- 
mately one-half dozen favor the other. 
A contingency analysis of the data in 
this type of experiment, however, offers 
fairly strong support for the single-band 
model. Although the results of the study 
of recognition of two specified frequen- 
cies are not entirely clear, they are in 
general agreement with the single-band 
model. The single-band model also 
gains an advantage in this case by de- 
fault: the multiband model in its pres- 
ent form, with the assumption that a 
simple linear combination of the outputs 
of the sensitive bands takes place before 
the final detection stage, is not appli- 
cable to the recognition experiment. 
With respect to the detection of signals 
composed of two or more frequencies, 
there exist again conflicting results, but 
the general superiority of the multiband 
model for this type of experiment seems 
clear.

Task-Specific Results. We have
compared the two models proposed to date
with the results of different types of 
experiments under the tacit assumption 
that the models are thoroughly com- 
petitive, that experiments will single out 
one model as generally preferable to the 
other. This is a reasonable procedure to 
follow at first, for one of the models 
may indeed be disclosed as superior to 
the other for all of the experiments
considered. To persist in regarding the
models as strong rivals, in the absence 
of such a clear result, would not be in 
keeping with the spirit of the explora- 
tion of central factors. If central factors 
play a significant role, if the analysis 
process is subject to intelligent control, 
or, to state it still another way, if the 
observer can vary his strategy of listen- 
ing to suit the requirements of different 
tasks, then we should examine the possi- 
bility that both of the models are satis- 
factory, but under different circum- 
stances. From this point of view, a 
comparison of the models with the re- 
sults of a variety of experimental tasks 
is undertaken to accentuate the relative 
strengths and weaknesses of the two 
models. We can note, as the outcome of 
the comparison, a tendency for the 
models to complement each other with 
respect to the several tasks we have 
considered. Let us briefly re-examine 
the evidence for task-specific results, 
with an eye toward the possibility that 
the observer is able to use either of the 
general strategies represented in the two 
models.

If we adopt the language of our 
models, we would say that, if the signal 
is composed of 16 frequencies separated 
by 250 cps, the observer listens with as 
many bands as there are frequencies, or 
perhaps with a single band that is as 
wide as the range of frequencies (Green 
et al., 1959). If the signal is composed 
of two frequencies, the observer listens, 
for reasons that are not evident, either 
with two bands (Green, 1958; Schafer
& Gales, 1949), or with just one (Marill, 
1956). It appears that if the signal is 
either of two specified frequencies, in 
which case the more efficient procedure 
is not readily apparent to the observer, 
especially at low signal levels, some ob- 
servers listen with both bands, and 
others with just one (Creelman, 1960;
Swets & Sewall, 1961; Swets et al.,
1959; Tanner et al., 1956; Veniar,
1958a, 1958b).

Green (1960) described another ex- 
periment that suggests that the observer 
can adjust to the demands of the task. 
In this experiment the signal was a band 
of noise 650 cps wide. The center fre- 
quency of the band was varied, with the 
observer's knowledge, from one group of 
trials to another. It was observed that 
the detectability of the signal was
independent of its frequency location.
This result would not be obtained if the 
observer were restricted at any given 
time to a single, classical critical band 
whose width is a fixed, increasing func- 
tion of frequency. If he were, the signal 
would increase in detectability as its 
center frequency increased, for he would 
be listening to a larger part of it. Green 
inferred from this result that critical 
bands are adjusted to be larger than 
their minimum widths when it is effec- 
tive to do so. 

It is also relevant to consider some 
fragmentary evidence that individual 
observers show different tendencies in
the number of bands or in the width of 
the band that they employ, which per- 
sist from one task to another. Veniar 
(1958a, 1958b) pointed out that her 
Observer 1 was superior to her
Observer 2 in detecting single frequencies,
as if Observer 1 were listening to a nar- 
rower band of masking noise. Further- 
more, Observer 1 showed a large decre- 
ment when asked to detect one of two 
specified frequencies, while Observer 2 
exhibited a very small decrement in this 
task. Finally, Observer 2 performed
significantly better than Observer 1 when
the signal was a wide band of noise. 
Similarly, Swets et al. (1959) found 
that the one of their three observers who 
matched the multiband prediction in de- 
tecting one of two frequencies (the other 
two observers matched the single-band 
prediction) performed very much better 
than the other two when the signal was 
composed of 16 frequencies separated by 
250 cps. 

Other Studies Relevant to the Two 
Models. The results of two additional 
experiments contribute to an evaluation 
of the two models that we have con- 
sidered. One of these experiments, on 
the reaction time in detecting one of 
two specified frequencies, is generally 
supportive of both models. The other,
on the detection of one of a large num- 
ber of frequencies, weighs against the 
generality of both models. 

Swets conducted an experiment, simi- 
lar to those described above on the de- 
tection of one of two specified frequen- 
cies, in which reaction time was recorded 
as well as the percentage of correct re- 
sponses. The analysis of reaction times 
corroborates the major results observed 
previously in terms of percentage of cor- 
rect responses; namely, that a decrement 
in performance is produced by frequency 
uncertainty and that this decrement 
is related to the frequency separation. 
In particular, as the separation be- 
tween the two frequencies increased and 
as $P(c)_{1 \lor 2}$ decreased, the reaction time
was observed to increase (Swets, un- 
published). This outcome is consistent 
with both of the models under discus- 
sion. This result is to be expected, in 
terms of the single-band model, because 
it is stated explicitly in this model that 
more time is required to observe at two 
frequencies than at one and that the 
amount of additional time required is a 
function of the difference between the 
two frequencies. The prediction for re- 
action time from the multiband model 
derives clearly from the model with the 
aid of an intermediate step. According
to the multiband model, the amount of 
effective noise entering the detection 
process increases as the critical bands 
that are adjusted to the two frequencies 
overlap less and less, and hence the ef- 
fective signal strength decreases. It is 
well known that reaction time is in- 
versely related to signal strength. 

Green (1961) studied detection under 
extreme conditions of frequency uncer- 
tainty: the signal could have any fre- 
quency between 500 and 4,000 cps. The 
decrement that resulted from such ex- 
treme uncertainty is only slightly larger 
than that which results from uncer- 
tainty about which of two specified fre- 
quencies will occur. Neither model 
comes close to predicting this result. 
Specifically, both models would predict 
a decrement in this case amounting to 
at least 10-12 db. of signal power, 
whereas the observed decrement is ap- 
proximately 3 db. 
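
To make the size of this discrepancy concrete, the decibel figures can be converted to ratios of signal power with the standard relation ratio = 10^(db/10). The short Python sketch below is only an illustration; the function name is an assumption introduced here, and nothing in it is taken from the studies cited.

```python
def db_to_power_ratio(db: float) -> float:
    """Convert a level difference in decibels to a ratio of signal powers."""
    return 10 ** (db / 10)

# Observed decrement (about 3 db) versus the predicted range (10-12 db).
observed_ratio = db_to_power_ratio(3)         # close to a doubling of power
predicted_low_ratio = db_to_power_ratio(10)   # a tenfold increase in power
predicted_high_ratio = db_to_power_ratio(12)  # roughly a sixteenfold increase

print(round(observed_ratio, 2),
      round(predicted_low_ratio, 2),
      round(predicted_high_ratio, 2))
```

Read this way, the observed decrement corresponds to about a doubling of the required signal power, whereas both models call for an increase of at least tenfold.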

On the Generality of the Two Models. 
We discussed earlier the particular 
limitations of each of the two models 
which indicated that they are, to some 
extent, complementary. The result of 
the study of extreme uncertainty re- 
veals that, taken together, they fail to 
represent adequately the operation of 
central factors in the analysis process. 
This result certainly weakens their posi- 
tion. However, to discard the two 
models because they are deficient in 
this single instance seems too severe an 
action. It would seem advisable to retain
them for further consideration on a 
number of counts. They are, for one, 
still useful in yielding predictions in the 
range in which they apply. Moreover, a 
more adequate replacement has not yet 
appeared. Still more to the point, per- 
haps, is the fact that it is not yet en- 
tirely clear that now is the time to at- 
tempt to formulate a more general 
model, one that would replace both of 
these and also account for the wayward 
result. If we are dealing with a process
as versatile as that implied by the assumption
of a large contribution from
central factors, it would seem reasonable 
to consider an alternative, deliberately 
temporizing approach—an approach in 
which other simple, single-strategy 
models are formulated to supplement the 
ones we have—at least until the flow of 
data gives signs of lessening. 
REWARD AND PUNISHMENT ASSOCIATED WITH
THE SAME GOAL RESPONSE: 


A FACTOR IN THE LEARNING OF MOTIVES* 


BARCLAY MARTIN 
University of Wisconsin 


Research is reviewed which provides support for the thesis that, within 
certain boundary conditions, the association of punishment with a goal 
response during learning adds to the persistence of the response during 
extinction beyond the effects of reward-only during learning. Some 
theoretical considerations are offered to account for this phenomenon 
which make use of the constructs of anticipatory punishment responses, 
rp-sp, and anticipatory reward responses, rr-sr.


It is common to postulate the exist- 
ence of acquired or learned motives to 
account for those persisting patterns of 
purposeful human behavior that are associated
with no obvious primary reinforcement.
Some of this purposeful behavior
is no doubt rewarded by primary
reinforcements about which we are pres- 
ently more or less ignorant, or is re- 
warded so intermittently that the pri- 
mary reinforcement escapes notice. When 
there is no primary reinforcement pres- 
ent and behavior persists, we are deal- 
ing, of course, with the phenomenon of 
resistance to extinction. Our current 
knowledge about antecedent learning 
conditions that affect resistance to ex- 
tinction of responses, although still in- 
complete, provides us with several em- 
pirical principles that are probably 
powerful enough to account for much 
of the purposeful yet innately nonre- 
warding behavior that we observe in 
ourselves and others. The resistance to 
extinction of responses learned under 
partial, or varied reinforcement, or var- 
ied stimulus conditions are examples of 
such principles. Another, and even more 
striking example, is the almost nonextinguishability
of avoidance responses
learned after a few severe traumatic 
experiences (Solomon & Wynne, 1954). 

In this paper attention will be directed
to a type of antecedent condition that 
has not received much experimentation, 
but which may contribute strongly to 
resistance to extinction; namely, the 
condition where reward and punishment 
are associated with the same goal re- 
sponse during learning. Naturally, one 
boundary condition that must be imposed 
is that the punishment has to be intro- 
duced in such a way that the goal re- 
sponse is not completely inhibited dur- 
ing acquisition. Somewhat paradoxically 
it is expected that punishment of the 
goal response in this way will lead to 
greater persistence at making this goal 
response during extinction than if re- 
ward only had been experienced. The 
remainder of the paper will be devoted 
to reviewing research relevant to this 
proposition, and then outlining a theo- 
retical formulation to account for this 
predicted phenomenon. 


RELEVANT RESEARCH 


Sears, Whiting, Nowlis, and Sears 
(1953) and Sears, Maccoby, and Levin 
(1957) report findings consistent with 
this expectation in regard to dependency 
behavior in children. In both studies
there is evidence that the more depend- 
ency responses, such as clinging to and 
resisting separation from mother, are 
punished the greater is the strength of 
these dependency responses. No matter
how carefully performed, correlational 
studies of this kind provide only indi- 
rect evidence for the basic proposition 
since there is no experimental control 
over the antecedent conditions. Perhaps,
for example, excessively dependent chil- 
dren cause their parents to be more 
punishing of dependency responses 
rather than vice versa. There have also 
been studies, including the two referred 
to above, which find that aggressive 
tendencies increase as punishment for 
aggression increases. Since aggression is 
a common response to punishment in 
general, maybe innately so to some ex- 
tent, the issue is considerably compli- 
cated in this case and, accordingly, stud- 
ies involving aggression as the goal 
response will not be reviewed. 

It may have occurred to the reader 
that associating reward and punishment 
with the same goal response is identical 
to the procedure employed by several 
investigators in an attempt to produce 
experimental neuroses (cf. Maier, 1949; 
Masserman, 1943). In most such stud- 
ies, however, the punishment was intro- 
duced in such a way as to completely 
inhibit or otherwise disorganize the be- 
havior; or the animal had no freedom 
to approach or withdraw in the situa- 
tion. Masserman (1943), however, does 
report “counterphobic” behavior in three 
cats which had received punishing air 
blasts in association with feeding. One 
cat, for example, at the feeding signal 
would run to the food box, insert his 
head beneath the lid, and then, instead 
of feeding, would remain immobile star- 
ing at the experimenter for prolonged 
periods. Maier (1949) was consistently
able to produce responses in rats that
were extremely resistant to extinction
by confronting them with an insoluble
discrimination problem in the Lashley
jumping box apparatus. However, as
several investigators have pointed out,
for example, Eglash (1954), the punishing
air blast necessary to make the
rat jump could continue to serve as a 
powerful motivator of a stereotyped 
avoidance response. At any rate, the 
presence of the initial air blast makes 
the situation too complicated to analyze 
in terms of reward and punishment as- 
sociated with the same goal response. 
There have been some studies, not 
necessarily oriented towards producing 
experimental neurosis, in which less dis- 
organizing punishments have been as- 
sociated with goal responses. Fisher 
(1955) using puppies as subjects ex- 
perimentally manipulated rewards and 
punishments associated with approach- 
ing an experimenter, a response roughly 
analogous to dependency or adult seek- 
ing behavior in children. He split lit- 
ters of puppies into four groups of six 
puppies each at 18 days of age and 
provided different social experiences for 
these groups until the puppies were 15 
weeks old. The two groups relevant to 
the present question were referred to as 
the Indulged group and the Punished-Indulged
group. The puppies in the former
group experienced 100% reward when- 
ever they approached the experimenter 
during half-hour sessions conducted five 
times per week throughout the training 
period. Reward consisted of being 
petted and fondled by the experimen- 
ter. The puppies in the Punished-In- 
dulged group experienced the five per 
week reward sessions just the same as 
the Indulged group but in addition also 
experienced five half-hour sessions per 
week in which they were punished every 
time they approached the experimenter. 
Punishment consisted of being switched
or handled roughly, and during six sessions
of being electrically shocked. A
[...] test was conducted during
the fourth, fifth, twelfth, and thirteenth
weeks of training and on the
third and fifth days after completion 
of training. This test consisted of hav- 
ing a human sit quietly in the corner 
of a room while an observer recorded 
the amount of time the puppy spent 
near the human through a one-way 
mirror. Differences between the two 
groups were not significant at the fourth 
and fifth weeks; however, by the twelfth 
and thirteenth weeks and even more so 
on the third and fifth days after termi- 
nation of the experimental training 
periods, the Punished-Indulged group 
spent significantly more time near the 
human than did the Indulged group. 
These differences were due to relative 
increases in time for the Punished- 
Indulged group. The Indulged group 
continued to spend about the same 
amount of time near the experimenter. 
An important observational note is that 
two of the six puppies in the Punished- 
Indulged group spent no time whatso- 
ever near the human on the twelfth- 
and thirteenth-week tests, and in spite 
of this the group as a whole spent over 
twice as much time near the human 
as did the Indulged group in which
there were no zero time scores. These 
two puppies, however, during the post- 
training tests did approach the human 
and spent more time near him than their 
litter mates in the Indulged group. The 
behavior of these two puppies points 
to the existence of strong avoidance 
tendencies along with the approach ten- 
dencies in these Punished-Indulged pup- 
pies. The fact that the Punished- 
Indulged puppies scored significantly 
higher on a standardized timidity test 
also underscores the presence of fear- 
fulness around humans. 

These results suggest that punishment
has added in some way to the per- [...]

[...] reinforcement effect, [...]
was received on a 50% partial reinforcement
basis and if punishment had no
effect at all or even a mildly inhibitory [...]

[...] reinforcement. [...]
effect of [...]
that due to partial reinforcement, it
would have been necessary to have had
a control group that received 50% reward
and 50% nonreward to have kept
the total number of training sessions
constant, and to have employed only one
type of punishment in the experimental
group.
Although this paper is primarily con- 
cerned with extinction effects, a number 
of studies have found punishment to 
produce a facilitating effect during the 
acquisition phase of learning. Such an 
effect in acquisition may or may not 
involve the same mechanisms as a simi- 
lar effect in extinction, but is closely 
enough associated with the present 
problem to warrant inclusion in this re- 
view. In a series of studies using rats 
as subjects Muenzinger has demon- 
strated convincingly that shock plus 
eventual food reward associated with 
the correct response in a visual discrimi- 
nation problem in a T maze produces 
more rapid learning than just reward
by itself (Muenzinger, 1934; Muen- 
zinger, Bernstone, & Richards, 1938; 
Muenzinger & Powloski, 1951). It is 
important to give the shock after the 
choice point since giving it before the 
choice point retards learning (Muenzinger
& Wood, 1935); also pretraining
to approach shock in order to get food 
produced a facilitating effect in a sub- 
sequent discrimination problem where 
shock was associated with the correct 
choice (Muenzinger & Baxter, 1957; 
Muenzinger, Brown, Crow, & Powloski, 
1952). Muenzinger suggested that these 
results might be accounted for on the 
basis of the emphasis or alerting effect 
of shock. 

Freeburne and Taylor (1952) report 
more rapid discrimination learning when 
rats were shocked for both right and 
wrong responses than in a no-shock con- 
trol group. Prince (1956) failed to con- 
firm the findings, although he did find 
the usual shock-right facilitating effect, 
and in addition found that the shock- 
right effect was greater the more trials 
the rats were allowed to get food re- 
ward for making the correct response 
before shock was introduced. Wisch- 
ner (1947) failed to obtain the shock- 
right effect in a noncorrection situation, 
and Fairlee (1937) found that learn- 
ing was markedly retarded if shock was 
given at the “moment of choice” in a T
maze.

Although most of the above rat stud- 
ies indicate that shock associated with 
the correct response can facilitate learn- 
ing, none bears conclusively on the ques- 
tion of the effect of punishment on the 
goal response, since in all cases the on- 
set of shock occurred during or im- 
mediately after the rat left the choice 
point in the maze, and the rat made 
the goal response after shock termina- 
tion. Thus, the running-toward-the- 
goal-box response can be conceived as 
being reinforced by shock termination
upon completion of the total response as 
well as by whatever food rewards may 
be provided. 

Drew (1938) did administer shock 
directly through the food in the goal 
box in a discrimination learning prob- 
lem when the rat made the correct re- 
sponse. He found in comparison to a 
no-shock group that learning was facili- 
tated to about the same extent as in 
shock-right or shock-wrong conditions 
where shock was given immediately after 
the choice point. In a different situa- 
tion altogether Holz and Azrin (1962) 
trained two pigeons to peck a plastic disc 
for food reward under a fixed-interval 
reinforcement schedule. A moderately 
intense electric shock given with every 
response produced no permanent change 
in responding. When this same punish- 
ment was given only during the first 
part of each interval, response rate de- 
creased in this part of the interval. 
However, when this punishment was 
given only during the last part of the 
interval, the punished responses actually 
increased slightly, while the unpunished 
responses decreased. The results of both 
Drew, and Holz and Azrin suggest that 
punishment may have a facilitating ef- 
fect even when it occurs in close spatial 
or temporal proximity to the goal response.
Holz and Azrin suggest that
the results of their study can be accounted
for on the basis of the “discriminative
properties” of punishment.
That is, punishment by virtue of past
association with reward now serves as 
a cue tending to elicit the rewarded 
response. 

There are only a few studies that 
have investigated extinction effects 
after rewarding and punishing the same 
response. Farber (1948) shocked rats
at a choice point in a T maze while the 
rats were learning a position response 
and found much greater resistance to 
extinction in these rats than was the
case for nonshocked controls. Farber 
suggested that during extinction the fear 
conditioned to the earlier parts of the 
maze could be reduced by going to the 
always “safe” goal box. Again, this ex- 
planation does not necessarily include 
the situation where the goal response it- 
self is punished. 

Logan (1960) reports extinction data 
in a study that is very relevant to the 
present issue. Rats learned to run 
down a straight runway for food re- 
ward, and in some cases were given a 
150 millisecond shock of progressively 
increasing intensity in the goal box when 
they were an inch or two in front of 
the food cups. Groups were run under 
a variety of conditions, but those most 
appropriate for comparison here are as 
follows: one group received 100% re- 
ward and no shock; a second group re- 
ceived 100% reward and 50% shock; 
and a third received 50% reward and 
50% shock with reward and shock never 
occurring in the same trial. Although the 
extinction curve of the third group is 
presented in a different graph from the 
others and no statistical analyses are 
provided for these comparisons, the re- 
sults appear to clearly confirm the pres- 
ent thesis. The 50% reward-50% shock 
group showed the least tendency to 
extinguish, whereas the 100% reward 
group showed the greatest rate of ex- 
tinction. In fact, the 50% reward-50% 
shock group showed no tendency to ex- 
tinguish whatsoever over the 48 extinc- 
tion trials. 

Unfortunately Logan does not provide
extinction data on a 50% reward,
50% nonreward group, which would allow
one to see if shock produced an
effect beyond that expected by partial
reinforcement. Actually, such a comparison
might not be a clear-cut test
of the facilitating effects of punishment. 
Many theories have been proposed to
account for the partial reinforcement
effect. To the extent that the "gener- 
alization decrement" or the ease with 
which animals can discriminate between 
the conditions of acquisition and ex- 
tinction is an important determinant of 
the partial reinforcement effect, this fac- 
tor would work against the facilitating 
effect of punishment. Thus, the half
shock, half reward group should extin- 
guish more readily than a half reward, 
half nonreward group, since the change 
to extinction should be easier to dis- 
criminate in the former. If in spite of 
this factor working against it, punish- 
ment were to prolong extinction beyond 
that found for a partial reinforcement 
condition it would be dramatic evidence 
for the issue at hand. As mentioned, 
however, results on a regular partial 
reinforcement group were not included. 
The fact that the 100% reward, 50% 
shock group showed greater resistance 
to extinction than the 100% reward 
group clearly supports the idea that 
punishment, shock in this case, pro- 
duces an effect that cannot be accounted 
for in terms of partial reward rein- 
forcement since reward occurred on all 
acquisition trials in both groups. Logan 
(1960) briefly comments that the fa- 
cilitating effect of shock 
suggests that shock acts directly to increase 
the persistence of the approach tendency and 
does not maintain extinction performance 
simply through a lessening of the avoidance 
tendency [p. 221]. 


There is evidence that the presence 
of an electrically charged grid before 
the safe area increases resistance to ex- 
tinction of learned escape responses to 
this goal area beyond that obtained 
when no shock at all is given during 
extinction (Gwinn, 1949; Moyer, 1957; 
Solomon, Kamin, & Wynne, 1953; 
Whiteis, 1956). Contradictory findings 
by Moyer (1955) and Seward and Ras- 
kin (1960) most likely indicate that 
the phenomenon is obtained only within
certain boundary conditions. Although
these studies again point to the possible 
facilitating effect of punishment re- 
ceived before the final goal is achieved, 
the presence of shock during “extinction”
and the fact that the reward consisted
of escape from shock make these
studies somewhat tangential to the 
present analysis. 

Ullman (1951, 1952) reports interesting
results in which “compulsive eating
symptoms” are produced in rats.
After several free eating sessions in a 
compartment, shock was introduced 
during the first 5 seconds of each minute 
for eight 20-minute sessions, during 
which food was available and the rats 
were hungry. Finally the rats were put 
in the compartment while food satiated 
but given the same shock sequences for 
four 20-minute sessions. The rats ate 
more while the shock was on than off, 
and the relative preference for eating 
during shock increased when they were 
food satiated. These results suggest that 
eating, even when not hungry, reduced 
the aversiveness of the shock; and that 
furthermore the shock stimuli perhaps 
came to serve as both cue and energizer 
for making the eating response when no 
primary hunger motivation was present. 

Punishment has been given to human 
subjects under shock-right and shock- 
wrong conditions in simple maze learn- 
ing situations, and learning was found 
to be superior for shock-right with low 
shock intensity, but superior for shock- 
wrong with high shock intensity (Feld- 
man, 1961). Freeburne and Schneider 
(1955) found that shock-right, shock- 
wrong, and shock for both right and 
wrong responses facilitated learning 
compared to a no-shock condition. They 
also found that continuing shock dur- 
ing extinction produced greater persist- 
ence at making the correct responses 
than when shock was discontinued. A 
comparison of extinction results between
a group that received shock for
right responses during acquisition and 
a group that received no shock for right 
responses during acquisition, where 
neither group received any reinforce- 
ment during extinction, was not re- 
ported. Such a comparison would be 
more relevant to the issue at hand. In
general, there is some question as to 
how relevant studies of college students 
doing simple learning tasks in psycho- 
logical laboratories can be to the basic 
question being raised, considering the 
complexity of the motivational patterns 
operating in a college student that 
would affect his interpretation of the 
situation and his persistence at the task 
during extinction. 


SOME THEORETICAL CONSIDERATIONS


The idea that inconsistent adminis- 
tration of rewards and punishments by 
parents will result in psychological “fixa- 
tions” in children is not a new one in 
the literature (cf. Fenichel, 1945, p.
66). And assuming that the empirical 
phenomenon is a real one, there will, of 
course, be no dearth of explanations for 
it. The question becomes one of what 
kind of formulation is most likely to 
clarify the mechanisms involved and at 
the same time suggest fruitful lines of 
empirical research. Before turning to 
my own preference for a theoretical for- 
mulation, reference will be made to 
other writers’ theories that are pertinent 
to this issue. 

Whiting (Sears et al., 1953) has suggested
that conflict is necessary to provide
drive strength for secondary motivational
systems, for example:
only those actions which are followed by both
reward and punishment become part of a
secondary motivational system [and] the conflict
between these two incompatible expectancies
provides the drive strength for instigating
the originally reinforced action [p. 180].
According to Whiting, conflict per se
would seem to be the source of facilita- 
tive drive. In another publication Whit- 
ing and Child (1953), in attempting to 
account for displaced aggression, intro- 
duce the notion that acquired fear re- 
sulting from previous punishment adds 
to the general drive level present and 
thereby increases the strength of the 
displaced response beyond that expected 
from ordinary stimulus generalization. 
Although Miller (1944, 1959) has 
both theorized and experimented exten- 
sively with respect to conflict behavior, 
he has not dealt to any great extent 
with the condition in which the punish- 
ment is applied in such a way that the 
animal never stops making the complete 
approach response; nor has he consid- 
ered the effect of such antecedent train- 
ing on extinction. Miller (1959) does, 
however, suggest in passing the possi- 
bility that punishment may have a fa- 
cilitating effect upon the approach 
response. 
It is entirely possible that administering at the 
goal shocks that are too weak to stop the 
animal from approaching and eating will be 
found to have the dynamogenic effect of increasing
speed of running or strength of pull
instead of reducing them as would be expected
from algebraic summation [p. 225]. 


Festinger’s (1961) theory of cogni- 
tive dissonance would seem to predict 
the phenomenon at issue here, although 
Festinger’s theory so far has been ap- 
plied primarily to the partial reinforce- 
ment situation or situations in which 
subjects had to exert effort or submit 
to boredom rather than experience clear- 
cut punishments. Nevertheless, it would 
seem to follow from the general line of 
thinking involved in dissonance theory 
that if a subject has experienced pun- 
ishment as well as reward associated 
with a goal response, he would be inclined
to reduce dissonance by telling
himself that this was a wonderful goal
response indeed, and well worth persist- 
ing for. 

Continuing in the cognitive vein, 
Tolman’s (1948) conception of nar- 
rowed cognitive fields produced by high 
motivation or stress would seem to pre- 
dict perseveration in extinction after 
reward and punishment. That is, be- 
cause he is not attending to other pos- 
sibilities the animal would continue 
making the same response much longer 
than if his cognitive field had not 
been constricted by stress. Easterbrook 
(1959) brings what amounts to this 
same theoretical position up-to-date in 
an excellent integration of research that 
can be accounted for by the notion of 
cue utilization, essentially the same no- 
tion as Tolman’s cognitive field. 

My own preference in theoretically 
analyzing the situation in which reward 
and punishment are associated with the 
same goal response is to employ some 
of the constructs used in current S-R 
behavior theory. The following repre- 
sents logical deductions of a relatively 
nonquantitative sort from these con- 
structs with regard to this particular 
situation rather than the introduction 
of any basically new constructs. 

This analysis involves in part an ex- 
tension of the formulations developed 
by Amsel (1958) to account for the ef- 
fects of nonrewarded trials when given 
in association with rewarded trials. Following
Amsel it is suggested that rg-sg
be considered the general term to apply
to all anticipatory responses and their
stimulus properties that become conditioned
to stimuli that precede the
goal response. Let rr-sr, then, represent
a subdivision of rg-sg which includes all
anticipatory reward responses and their
associated stimulus properties, and let
rp-sp represent anticipatory punishment
responses and their stimulus properties.²
In this formulation rr-sr refers to more
than the merely observable fractions of
a consummatory response such as chop
licking or salivating, and is meant to
include any central nervous system or
other reaction conditioned to stimuli
that precede a rewarded goal response.
Such rr-sr have activating or energizing
properties and may also participate in
the secondary reinforcement process in
the sense that any change in the stimulus
situation producing an increase in
rr-sr would be reinforcing. Spence
(1960), however, would apparently restrict
the function of rr-sr to an energizing
one and not include a secondary
reinforcing property.

The rp-sp subdivision of rg-sg is conceived
to be similar to but perhaps
somewhat more inclusive than the construct
of acquired fear. Thus, rp-sp
consists of all reactions, observable or
otherwise, that become conditioned to
stimuli that precede an aversive or
punishing experience.

And similar to rr-sr, rp-sp have energizing
properties, and in addition have
the capacity to play a role in the process
of secondary negative reinforcement,
that is, any change in the stimulus
situation producing a decrease in
rp-sp would be reinforcing. Thus, in
simplified summary, when they occur
separately the presence of rr-sr will cause
the animal to attempt to increase these
rr-sr in number and strength. The complicating
feature in this analysis becomes
apparent when we consider the possibility
that rr-sr may come to serve as conditioned
stimuli to evoke rp-sp, and at the
same time rp-sp may come to serve as
conditioned stimuli to elicit rr-sr.


2 This terminology might be confusing in
that Spence and his associates have habitually
limited the use of rg-sg to anticipatory reward
reactions. However, it seems to the writer
that Amsel’s suggestion was a good one, and
that the interest of overall clarity is best served
by defining rg-sg as the generic term. Thus,
rr-sr is similar to the more restricted meaning
that rg-sg has been given in the past.

The constructs of rr-sr and rp-sp are
similar, respectively, to anticipatory
positive affect change and anticipatory
negative affect change suggested by 
McClelland, Atkinson, Clark, and Lowell 
(1953). And accordingly, deductions 
from the McClelland et al. theory of 
motivation should lead to similar ex- 
pectations with regard to the effect of 
rewarding and punishing the same goal 
response. Also, the constructs of hope 
and fear proposed by Mowrer (1960) 
in the revision of his two-factor learning
theory are no doubt similar, respectively,
to rr-sr and rp-sp.

Amsel (1958) employed the term
rf-sf to refer in similar fashion to the
conditioned fraction of the frustration
state induced by nonreward. One might
think of punishment in terms of a dimension
of severity starting at nonreward
frustration, in which case rp-sp become
simply stronger rf-sf. However, the term
punishment as used in this paper is reserved
for situations in which the animal
is subjected to clearly noxious stimulation.
It is possible, though, that the inevitable
frustration or conflict accompanying
a punished response adds to the
nature and strength of rp-sp.

There are three primary considera- 
tions that would lead to the expectation 
of greater resistance to extinction with 
punishment than without it. The first 
reason is that more activation or drive 
should be present than if only reward 
had been given in acquisition. The 
sources of this additional activation can 
be somewhat arbitrarily separated as 
follows: (a) anticipatory punishment,
rp-sp, elicited by external or internal
stimuli preceding the goal response
that would have been present if there
had been no reward or reward motivation,
such as thirst or hunger; and (b)
rp-sp elicited by drive stimuli associated
with reward motivation and stimuli associated
with anticipatory reward, rr-sr.
The a and b division simply serves
to point up two important sources of cues
to which rp-sp has been conditioned and
with respect to which extinction of rp-sp
must occur. The drive or energizing
properties of rp-sp should facilitate the 
dominant approach tendency during ex- 
tinction, perhaps by a multiplicative 
relationship with the strength of the 
approach habit. That increased activa- 
tion or arousal does facilitate a domi- 
nant response tendency in a given situa- 
tion, at least within certain limits, 
hardly needs documentation. 
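
The multiplicative relationship suggested above can be illustrated with a minimal sketch. It assumes the Hull-style form E = D x H and assumes that the same drive level multiplies both the approach and the avoidance habit; the function names and the numerical values are hypothetical and are not taken from any experiment discussed here.

```python
def reaction_potential(drive, habit):
    """Hull-style multiplicative combination (assumed form): E = D * H."""
    return drive * habit


def approach_margin(drive, approach_habit, avoidance_habit):
    """Margin by which the dominant approach tendency exceeds avoidance."""
    return (reaction_potential(drive, approach_habit)
            - reaction_potential(drive, avoidance_habit))


# Hypothetical values: identical habit strengths, but extra activation
# (here attributed to rp-sp) raises the drive term from 1.0 to 1.5.
reward_only = approach_margin(drive=1.0, approach_habit=0.8, avoidance_habit=0.3)
with_punishment = approach_margin(drive=1.5, approach_habit=0.8, avoidance_habit=0.3)

print(reward_only, with_punishment)
```

Under these assumptions the margin in favor of the dominant response grows in proportion to drive, which is the sense in which added activation would facilitate the approach tendency during extinction.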

The second consideration is that dur- 
ing the learning process the anticipatory 
punishment responses, rp-sp, have been
part of the stimulus situation in which
the approach response has been rein-
forced and made dominant over avoid-
ance responses, and thus eventually rp-
sp serve as additional cues tending to
elicit approach responses. This is es- 
sentially the mechanism that Amsel 
(1958) employed to account for the 
partial reinforcement effect; namely, 
that the stimulus properties of the an-
ticipatory frustration reaction, rf-sf,
come to serve as cues eliciting the ap- 
proach response. Likewise, Holz and 
Azrin (1962) employ the same idea in 
proposing that the discriminative pro- 
perties of punishment come to elicit the 
goal response. 

It is even possible that rp-sp come to
elicit additional amounts of rr-sr and
that accordingly there is more antici- 
patory reward or secondary reinforce- 
ment experienced at a distance from the 
goal than would have been the case 
without punishment. In general, either 
overt approach responses or anticipa- 
tory reward responses associated with 
the stimulus properties of rp-sp should
prolong extinction. The control over 
the dominant response by the stimulus
properties of rp-sp may, in part, corre-
spond to Tolman's notion of stress-in-
duced narrowing of the cognitive field.

The third reason that punishment 
should prolong extinction is that there 
is good evidence that anticipatory pun-
ishment, rp-sp, is more resistant to ex-
tinction than acquired reward, rr-sr.
Solomon and Wynne (1954) summarized 
the research bearing on this assumption 
and were so impressed by the resistance 
to extinction of acquired fear, especially 
when the initial reinforcement was quite 
severe or traumatic, that they postulated 
a partially irreversible effect, that is, 
that acquired fear never completely ex-
tinguishes. Thus, if some aspect of rp-sp
does extinguish relatively slowly, its 
effect both in terms of augmenting the 
general level of activation present and 
as a stimulus associated with the ap- 
proach response should persist for a 
relatively long time. 

There are, of course, factors working 
against the thesis proposed in this paper. 
Two such factors would seem to be of 
primary significance. First, some avoid- 
ance tendency in all likelihood does 
develop with respect to a goal response 
that has been punished, and such an 
avoidance tendency might be quite re- 
sistant to extinction. Second, the con- 
ditions of acquisition and extinction 
should be more readily discriminated 
when punishment as well as reward is 
discontinued. This would be especially
true if the comparison group involved 
partial reward reinforcement. 

It should be emphasized again that 
the hypothesized increased resistance to 
extinction effect of punishment is ex- 
pected to occur only within certain 
boundary conditions, the most impor- 
tant of which involves the strength of 
avoidance relative to approach tenden- 
cies produced in acquisition. Factors 
such as the intensity and frequency of 
punishment, the gradualness with which 
it is introduced, and the amount of re-
ward-only training given before punish-
ment is introduced are undoubtedly im-
portant in creating the circumstances
under which the expected effect will
occur.


REFERENCES 


Amsel, A. The role of frustrative nonreward
in noncontinuous reward situations. Psychol.
Bull., 1958, 55, 102-119.

Drew, G. C. The function of punishment in
learning. J. genet. Psychol., 1938, 52, 257-
267.

Easterbrook, J. A. The effect of emotion on
cue utilization and the organization of be-
havior. Psychol. Rev., 1959, 66, 183-201.

Eglash, A. Fixation and inhibition. J. ab-
norm. soc. Psychol., 1954, 49, 241-245.

Fairlie, C. W., Jr. The effect of shock at the
moment of choice on the formation of a
visual discrimination habit. J. exp. Psychol.,
1937, 21, 662-669.

Farber, I. E. Response fixation under anxiety
and nonanxiety conditions. J. exp. Psychol.,
1948, 38, 111-131.

Feldman, S. M. Differential effects of shock
in human maze learning. J. exp. Psychol.,
1961, 62, 171-178.

Fenichel, O. The psychoanalytic theory of
neurosis. New York: Norton, 1945.

Festinger, L. The psychological effects of in-
sufficient rewards. Amer. Psychologist, 1961,
1-11.

Fisher, A. E. The effects of differential early
treatment on the social and exploratory be-
havior of puppies. Unpublished doctoral dis-
sertation, Pennsylvania State University,

Freeburne, C. M., & Schneider, M. Shock
for right and wrong responses during learn-
ing and extinction in human subjects. J. exp.
Psychol., 1955, 49, 181-186.

Freeburne, C. M., & Taylor, J. E. Discrimi-
nation learning with shock for right and
wrong responses in the same subjects. J.
comp. physiol. Psychol., 1952, 45, 264-268.

Gwinn, G. T. The effects of punishment on
acts motivated by fear. J. exp. Psychol.,
1949, 39, 260-269.

Holz, W. C., & Azrin, N. H. Interactions be-
tween the discriminative and aversive prop-
erties of punishment. J. exp. Anal. Behav.,
1962, in press.

Logan, F. A. Incentive. New Haven: Yale
Univer. Press, 1960.

McClelland, D., Atkinson, J. W., Clark,
R. A., & Lowell, E. L. The achievement
motive. New York: Appleton-Century-
Crofts, 1953.

Maier, N. R. F. Frustration: The study of
behavior without a goal. New York: Mc-
Graw-Hill, 1949.

Masserman, J. H. Behavior and neurosis.
Chicago: Univer. Chicago Press, 1943.

Miller, N. E. Experimental studies of con-
flict. In J. McV. Hunt (Ed.), Personality
and the behavior disorders. Vol. I. New
York: Ronald, 1944.

Miller, N. E. Liberalization of basic S-R
concepts: Extensions to conflict behavior,
motivation, and social learning. In S. Koch
(Ed.), Psychology: A study of a science.
Vol. I. Sensory, perceptual, and physiological
formulations. New York: McGraw-Hill,
1959. Pp. 196-292.

Mowrer, O. H. Learning theory and behavior.
New York: Wiley, 1960.

Moyer, K. E. A study of some of the variables
of which fixation is a function. J. genet.
Psychol., 1955, 86, 3-31.

Moyer, K. E. The effects of shock on anxiety-
motivated behavior in the rat. J. genet.
Psychol., 1957, 91, 197-203.

Muenzinger, K. F. Motivation in learning: I.
Electric shock for correct response in the
visual discrimination habit. J. comp. physiol.
Psychol., 1934, 17, 267-277.

Muenzinger, K. F., & Baxter, L. F. The effect
of training to approach vs. training to escape
from electric shock upon subsequent discrimi-
nation learning. J. comp. physiol. Psychol.,
1957, 50, 252-257.

Muenzinger, K. F., Bernstone, A. H., &
Richards, L. Motivation in learning: VIII.
Equivalent amounts of electric shock for
right and wrong responses in a visual dis-
crimination habit. J. comp. physiol. Psychol.,
1938, 26, 177-186.

Muenzinger, K. F., Brown, W. O., Crow,
W. J., & Powloski, R. F. Motivation in
learning: XI. An analysis of electric shock
for correct responses into its avoidance and
accelerating components. J. exp. Psychol.,
1952, 43, 115-119.

Muenzinger, K. F., & Powloski, R. F. Moti-
vation in learning: X. Comparison of
electric shock for correct turns in a corrective
and non-corrective situation. J. exp. Psy-
chol., 1951, 42, 118-124.

Muenzinger, K. F., & Wood, Alda. Motivation
in learning: IV. The function of punish-
ment as determined by its temporal relation
to the act of choice in the visual discrimina-
tion habit. J. comp. physiol. Psychol.,
20, 95-106.

Prince, A. I., Jr. Effect of punishment on




Patterns of child rearing. White Plains, 
N. Y.: Row, Peterson, 1957. 

Sears, R. R., Whiting, J. W. M., Nowlis, V.,
& Sears, P. S. Some child-rearing antece-
dents of aggression and dependency in young
children. Genet. Psychol. Monogr., 1953, 47,
135-234.

Seward, J. P., & Raskin, D. C. The role of fear
in aversive behavior. J. comp. physiol. Psy-
chol., 1960, 53, 328-335.

Solomon, R. L., Kamin, L. J., & Wynne, L. C.
Traumatic avoidance learning: The out-
comes of several extinction procedures with
dogs. J. abnorm. soc. Psychol., 1953, 48,
291-302.

Solomon, R. L., & Wynne, L. C. Traumatic
avoidance learning: The principles of anx-
iety conservation and partial irreversibility.
Psychol. Rev., 1954, 61, 353-385.


discrimination learning in a non-correction 
situation. J. exp. Psychol., 1947, 37, 271-284.
(Received February 19, 1962) 


Psychological Bulletin 
1963, Vol. 60, No. 5, 452-459


A RECONSIDERATION OF THE EXTINCTION
HYPOTHESIS OF WARM UP IN MOTOR 
BEHAVIOR 


M. P. FELDMAN 
Institute of Psychiatry, University of London 


The extinction hypothesis derives from that part of Hullian theory which
is concerned with inhibitory processes. It attributes a significant portion
of postrest warm up (WU) to the extinction of conditioned inhibition
(sIR) early in postrest practice. Experiments specifically designed to test
the hypothesis provide supporting evidence. The boundary conditions of
the hypothesis are specified. The few studies which satisfy these condi-
tions tend to favor the extinction hypothesis, rather than the competing
set and interference hypotheses. An outline is given of the experimental
design suggested by the extinction hypothesis as most likely to lead to
development.
In a recent paper, Adams (1961)
considered the evidence for the view 
that the immediate postrest increment 
(henceforth referred to as WU as a 
convenient symbol with no explanatory 
implications) in verbal and motor learn- 
ing arises from conditions other than 
direct interference with goal responses 
during rest. He discussed two explana- 
tory hypotheses, namely, set (Ammons, 
1947a) and inhibition, the latter solely 
in connection with the learning of mo- 
tor skills. The writer considers that 
Adams was too cavalier in his treatment 
of the inhibition hypothesis, suggested 
by Eysenck (1956), and the present 
paper will attempt to repair this omis- 
sion. Certain of the evidence to be con- 
sidered was either not yet published, 
or was unavailable when Adams com- 
pleted his survey. The convention of 
measuring WU as the difference between 
the first postrest trial and the score on 
the trial at the peak of the rise before 
the decremental segment begins (Adams,
1952) is adhered to throughout. The
method suggested by Ammons (1947b) 
would only be appropriate if his ex- 
planation of postrest increment as re- 
covery of set were correct. It is precisely 
this issue which is raised by the present 
paper. In addition, it has been pointed 
out (Eysenck, 1956) that the Ammons
measure interferes with the measure-
ment of reminiscence.
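
The measurement convention adopted above can be sketched as follows. This is only an illustration: the function name and the trial scores are hypothetical, and the sketch assumes that the decremental segment begins at the first postrest trial whose score falls below the running peak, a simplification which real, noisy data might not permit.

```python
def warm_up_increment(postrest_scores):
    """WU as the peak of the initial postrest rise minus the first postrest trial.

    Assumption: the decremental segment starts at the first trial whose
    score drops below the highest score reached so far.
    """
    if not postrest_scores:
        raise ValueError("at least one postrest trial is required")
    first = postrest_scores[0]
    peak = first
    for score in postrest_scores[1:]:
        if score < peak:
            break  # decremental segment begins; stop tracking the rise
        peak = score
    return peak - first


# Hypothetical time-on-target scores for successive postrest trials.
print(warm_up_increment([40, 46, 51, 53, 50, 48]))  # 13
```

A record which declines from the first postrest trial onward yields a WU of zero under this convention.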
The discussion will be confined to 
motor behavior, on which most of the 
work relevant to the inhibition hypothe-
sis has been carried out. Adams’ crite-
rion for ascribing a portion of WU in-
crement to recovery of set, rather than
to extinction of conditioned inhibition,
involved the demonstration that WU
would be largely abolished by perform-
ance of a neutral task during the re-
tention interval. With respect to motor
behavior he concluded “efforts to locate
a neutral task which would influence
WU... have met with failure.” The
explanation for this failure might well
be the inadequacy of the set hypothesis.
There is, of course, an implicit circu-
larity in the neutral task technique.
In order to demonstrate this point, let
us suppose a task which is assumed to
be neutral, and performance on which,
during rest, has indeed resulted in WU
reduction. The explanation for this
might well be that such a task is not
neutral, but involves primary S-R se-
quences in common with the criterion
task, so that the reduction effect is due
to the strengthening of these and not
of secondary goal responses. Location
it
possible, on a priori grounds, to
which tasks are neutral. Hence, it would
seem that consideration of the set hy-
pothesis in the field of motor behavior
has to wait upon further research on the 
muscular substrata of motor task re- 
sponses, as Adams indeed suggests. 
However, if this is true, it cannot also 
be true that “the set hypothesis has the 
most status” (Adams, 1961, p. 261) so 
far as motor behavior is concerned. At 
present there is no basis for differentiat- 
ing between the set hypothesis and the 
interference hypothesis (e.g., Under-
wood, 1957) in psychomotor tasks.
The inhibition hypothesis, while not
denying the possible contribution of set
and interference, maintains that a sig-
nificant portion of WU is due to the
extinction of the nonresponding habit
(sIR) conditioned during previous prac-
tice. The particular inhibition hypothe-
sis under discussion forms part of a
comprehensive set of postulates deriving 
from Hull (1943) and Kimble (1949) 
and elaborated by Eysenck (1956). 
Adams’ (1961) account, though brief,
was satisfactory, but did not include
the part played by drive (Eysenck & 
Willett, 1961). This is considered to be 
as follows: Reactive inhibition is re- 
garded as a negative drive which ac- 
crues until it has canceled out the posi- 
tive drive (D) active in the testing 
situation. At this point performance 
stops and an involuntary rest pause 
(IRP) ensues. During this IRP, IR
dissipates and performance is then re-
sumed until IR again equals D when
another IRP occurs, and so on. A rest- 
ing response is thus conditioned as a 
habit, and it is maintained that early 
in postrest performance this response is 
extinguished through nonreinforcement, 
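
The cycle of accrual and dissipation just described can be made concrete with a small simulation. It is a sketch under stated assumptions rather than a model proposed by any of the authors cited: time is divided into discrete steps, reactive inhibition (IR) grows by a fixed amount on each step of work, an involuntary rest pause (IRP) occupies one step and removes a fixed amount of IR, and all parameter values are hypothetical.

```python
def simulate_practice(drive, accrual=1.0, dissipation=3.0, steps=60):
    """Return the step indices at which involuntary rest pauses occur.

    Assumptions (not taken from the source): discrete time steps, a fixed
    increment of inhibition per working step, and a one-step pause that
    removes a fixed amount of inhibition.
    """
    inhibition = 0.0
    irp_steps = []
    for step in range(steps):
        if inhibition >= drive:
            # Inhibition has canceled out the positive drive: performance stops.
            irp_steps.append(step)
            inhibition = max(0.0, inhibition - dissipation)
        else:
            # Performance continues and inhibition accrues.
            inhibition += accrual
    return irp_steps


low_drive = simulate_practice(drive=10.0)
high_drive = simulate_practice(drive=20.0)
print(low_drive[0], high_drive[0])  # 10 20
```

With these hypothetical parameters the first pause arrives later, and pauses are fewer, under the higher drive setting, so that less opportunity exists for a resting response to be conditioned within a practice period of fixed length.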


separated by 5-minute rests, there was 
little, if any, WU (Eysenck, 1960). 
Star (1957) used 16 cycles of 90-second. 


the length of prerest practice for which
WU is absent should be greater for
high than for low drive subjects. In the 
experiments concerned with these pre- 
dictions “drive” has been manipulated 
by having applicants for engineering 
apprenticeships practice on the pursuit 
rotor, ostensibly as part of the battery 
of entrance tests which they were tak- 
ing to secure employment with a major 
engineering firm. The applicants were 
well aware that only one in five of them, 
at best, would be selected for interview. 
The low drive group consisted of sub- 
jects of similar age, already working as 
apprentices and knowing that their task 
performance had no implications for 
their future employment. Brown (1961) 
has offered several criteria for the iden- 
tification of motivational variables. He 
considers that “tending to facilitate or
energize several different responses” is 
the one most generally accepted. The 
experimental setup described above has 
so far been demonstrated to produce 
predicted and significant effects in length 
of spiral aftereffect (Eysenck & Hol- 
land, 1961), pursuit rotor reminiscence 
(Eysenck & Maxwell, 1961; Eysenck 
& Willett, 1961), maintained handgrip 
(Feldman, 1961), and nonsense syllable 
learning (Willett & Eysenck, 1962). It 
thus seems reasonable, while purists 
might dislike the use of the term drive 
in any but a deprivation context, to 
consider that a motivational variable is 
operating. The following results have 
been obtained to date, using the pursuit 
rotor task. WU was absent in a high 
drive group following 3 minutes of pre- 
rest practice, but present in a low drive 
group (Eysenck & Willett, 1961). The 
same authors demonstrated that WU, 
following 6 minutes of prerest practice, 
while present in a high drive group was 
very much less steep than for a low 
drive group (see Figure 1). A similar 
finding was noted for 8 minutes of pre-
rest practice by Eysenck and Maxwell
(1961). Presumably the set hypothe- 
sis would account for these findings by 
assuming that high drive subjects are 
less prone to losing secondary goal re- 
sponses during rest. Similarly, the in- 
terference hypothesis might assume that 
highly motivated subjects are less likely 
during rest to carry out S-R sequences 
which would interfere with goal S-R 
sequences. Both assumptions lose some
of their plausibility in that neither 
group of subjects was aware that prac- 
tice would be resumed postrest. 

From the above discussion it follows, 
not only that the inhibition hypothesis 
possesses some merit, but also that it 
is most important to specify the condi- 
tions of practice and distribution under 
which phenomena are predicted to vary; 
and that a theory which does so is of 
great heuristic value. There is available 
in the field of pursuit rotor learning a 
considerable body of empirical data 
(e.g., Ammons, 1947b, 1950, 1951;
Ammons & Willig, 1956) concerning the
relationship between conditions of prac-
tice and distribution and such phe-
nomena as time on target, reminiscence,
and postrest increment. It is unfortu-
nate that in his theoretical discussion
Adams should prefer to use blanket
terms such as “massed” and “well
spaced,” rather than specify the actual
conditions. Such specification is
particularly necessary when testing de-
ductions, it being most important to
keep within the boundary conditions
specified by the hypothesis under test.
For instance, Adams considered Ey-
senck's finding (1956) of WU under
massed as opposed to spaced practice
conditions to be tenuous and quotes
contrary evidence. Before discussing
this, the boundary conditions of the in-
hibition hypothesis will be specified as
precisely as possible.

FIG. 1. Illustration of warm-up decrement under conditions of high and low drive following a
cycle of 6 minutes massed practice and 6 minutes rest on the Rotary Pursuit Test (from Eysenck &
Willett, 1961).

1. Generally, the inhibition hypothe- 
sis considers WU to consist of two com- 
ponents: a small increment due to true 
“warm up” after rest, and a larger in- 
crement due to extinction of the resting 
response. The prediction for massed 
and distributed practice groups thus 
would not be that little WU will be ob-
served when conditions are well spaced,
but that WU following massed practice 
should be significantly greater than that 
following well spaced practice, that is, 
an increment due solely to the contin- 
ued massing of practice beyond a cer- 
tain point. 

2. Specifically, it is necessary to de- 
lineate precisely what is meant by well 
spaced and massed, in order that the 
prediction might be adequately tested. 
(a) Whenever there is a possibility that
subjects have been specially motivated,
massed practice periods not in excess
of about 6 minutes are suspect. Ideally,
to avoid all doubt, massed practice
should continue for at least, say, 10
minutes. (b) Massed practice should
be truly massed, that is, the subject
should not be required to stop every
20 seconds or so to allow the experi-
menter to reset the clock. (c) Prefera-
bly, WU should be studied on sessions
within, rather than between, days. (d)
With respect to the distributed prac- 
tice group, the interval between practice 
periods should be sufficiently long to 
allow dissipation of all the inhibition 
built up during each period. Failing 
such an optimal arrangement of work 
and rest the suspicion arises that suffi- 
cient inhibition might have accumulated 
in later cycles to give rise to IRPs, and 
hence enable the conditioning of the rest- 
ing response. The effect would again be 
that WU for the distributed group might 
be almost as great as for the massed 
group. Distributed practice usually con- 
sists of several practice periods of 30 sec- 
onds or less and when such a group remi- 
nisces to almost the same extent as a 
massed group which has practiced for 
several minutes, such an interpretation 
seems likely. It is a generalization, from 
experimental evidence, that reminis- 
cence is a negatively accelerated in- 
creasing function of prerest massed 
practice of up to 5 minutes duration 
(Ammons, 1947b). Data on optimal 
work-rest cycles have come from several 
workers. Kimble and Bilodeau (1949)
found the total time on target to be in 
the following descending order of work- 
rest cycles (the first member of a pair 
denoting seconds of work time; the 
second, seconds of rest time): 10-30, 10- 
10, 30-30, 30-10. Ammons (1950) had 
his distributed group practice for 36 20- 
second periods separated by rests of 
varying length. Performance level pre- 
rest was better for groups resting be- 
tween work for 50 seconds or for 2 min- 
utes than for a group resting for 20 sec- 
onds. It is also of interest that whereas 
the 20-second rest group showed consid- 
erable reminiscence, none was shown 
by groups resting for 50 seconds or 
longer. In any comparison with a
massed group the 50-second rest group 
is clearly a more appropriate repre- 
sentative of distributed practice than is 
the 20-second group. The point will 
be returned to later. Finally, Kimble 
(1949) gave varying lengths of rest be- 
tween 30-second trials. He found that 
as the number of trials increased, the 
length of intertrial rest for optimal per- 
formance increased also. (e) On a pro- 
cedural point, postrest practice should 
be under the same conditions of distri- 
bution for both groups under test. 

It is evident from the foregoing that 
the simple formula “massed as com-
pared to well distributed” is too all-em-
bracing, and the mere listing of experi-
ments which fall under this general rubric
is unsatisfactory. The papers listed by 
Adams as contradicting the inhibition 
hypothesis exemplify a wide variety of 
practice conditions within the general 
heading, and as the experiments were 
not specifically designed to test the in- 
hibition hypothesis they may well have 
been lacking in one or more of the above 
criteria. 

Experiments on WU in which ses- 
sions are 24 hours apart are not satis- 
factory as has been pointed out. Ex- 
amples are Adams (1952), Kimble and 
Shatel (1952), and Digman (1959). 
Despite this infringement of boundary 
conditions it is of interest that WU 
was substantially larger for Adams' 
massed group (6-minute practice peri- 
ods) than for his distributed group (10- 
second work periods, 40-second rests) 
on the last two of his 4 sessions. In the 
cases of both Digman and Kimble and 
Shatel massing was not truly complete, 
there being a 2-second pause between 
30-second trials in the former, and a 
10-second pause between 50-second trials 
for the latter. One of the experiments 

cited by Adams, which satisfies the cri- 
terion of successive sessions on the same 
day, is that by Denny, Frisbey, and
Weaver (1955). This suffers from 
another shortcoming, namely, that for 
their distributed group the rest periods 
were less than optimal for the length of 
practice given (10 cycles of 30-second 
work periods, 30-second rests). The 
evidence for this is the fact that the 
distributed group reminisced to almost 
the same extent as the massed group (5 
minutes continuous practice). Because 
of this, the possibility arises that Cri-
terion d, concerning the avoidance of
resting response conditioning in the
later cycles of distributed practice, is 
not met, and the results of this experi- 
ment cannot be used in refutation of 
the inhibition hypothesis. Jahnke and 
Duncan (1956) carried out a very large 
scale experiment involving 6 minutes of 
massed practice, contrasted with 2 con- 
ditions of distribution. Intersession rests 
were either 10 minutes, 1 day, 2 days, 
1 week, or 4 weeks. Postrest testing 
was distributed (21 cycles of 10 seconds 
practice, 20 seconds rest) for all three 
groups. Comparing groups which rested 
for 10 minutes, WU was found to be 
considerably more marked for the 
massed than for either distributed 
group, although both distributed groups 
did show some WU. With intersession 
rests of 1 day or more, differences be- 
tween massed and distributed groups 
progressively decreased. 

Adams cites the results of Ammons 
(1950) as contradicting the inhibition 
hypothesis. Reference has already been 
made to this experiment under Criterion 
d and evidence has been presented that 
groups resting 50 seconds or 2 minutes 
between 20-second trials represent dis- 
tributed practice more adequately. 
Comparing these groups, and not the 
20-second rest group as did Adams, 
with the 12-minute continuous practice 
group it is found that WU on the lat- 
ter is greatly in excess of that on the 
two former. Postrest practice in this
experiment was massed for all groups;
hence, in an experiment in which all
the criteria for testing the inhibition 
hypothesis are satisfied, the hypothesis 
is substantiated. Another experiment 
in which all criteria are satisfied and in 
which the hypothesis is supported is 
that by Ammons and Willig (1956). 
Four groups of subjects practiced for 
110 minutes (90 minutes training and 
20 minutes in test conditions). Two 
of these groups practiced for 10-minute 
periods, separated by 20-minute rests, 
two for 1-minute periods with 2-minute 
rests. Following the training period one 
group in each condition changed to the 
other condition for the test period, the 
other one practicing under the same 
conditions. Results for WU are clear and 
unequivocal. In the first test period, 
the two groups which had practiced 
under massed conditions during train- 
ing showed considerable WU, the groups 
which had practiced under distributed 
conditions showed only very slight WU. 
Similar results were obtained for the 
second period of test; the two groups 
which practiced continuously for the 
first test period showed very much more 
WU than the two groups which had un- 
dergone distributed practice. 

Two conclusions emerge from the em- 
pirical findings in this field. 

1. When the criteria of the inhibi- 
tion hypothesis are satisfied there is con- 
siderably more WU following continuous 
than well distributed practice. Such an 
increment can only be a consequence 
of prerest massing. The phenomenon 
cannot readily be handled by either of 
the competing hypotheses. The form 
of the inhibition hypothesis under dis- 
cussion explains this finding as due to 
the extinction of the resting response 
conditioned during massed practice. 

2. With conditions of distribution 
such as to exclude the occurrence of in-
voluntary rest pauses, some WU is ob-
served postrest, particularly when ses-
sions are a day, rather than 10 minutes,
apart. 

Eysenck (1956) has pointed out that 
the inhibition hypothesis would be seri- 
ously weakened if there was doubt 
about the development of conditioned 
inhibition (sIR) as such. There has been
considerable controversy on this issue, 
for example, Ammons and Willig 
(1956), Reynolds and Adams (1953), 
Eysenck (1956), Reynolds and Bilo- 
deau (1952), and it will not be entered 
into in any detail. Two points, however, 
may be made. Firstly, there appears 
to have been a dearth of really serious 
and determined attempts to demon- 
strate the phenomenon. As an example 
of one such attempt, Woodworth and 
Schlosberg (1954) have cited an experi- 
ment by Karsten (1928) in which sub- 
jects were required to read a short poem 
again and again until they wished to 
stop. When they showed signs such as 
restless movements and slips of the 
tongue they were given another poem, 
and so on, until they refused to read any 
more poems; prose passages were then 
substituted until the subject refused to 
serve any longer. Woodworth has made 
the following comments: "rather a ter- 
rible experiment but with some good re- 
sults; the spread or generalisation of 
satiation, and partial recovery after a 
day's interval, much as in an extinction 
series" (p. 693). For a much more rigor- 
ous experiment, involving water deprived 
rats in a runway situation, and resulting 
in complete extinction of the running 
response, see Kendrick (1958). Sec-
ondly, it is possible to specify the con-
ditions under which sIR should appear
in pursuit rotor learning. These are: 
(a) Practice to be perfectly massed and 
to continue for as long as possible. 
(b) There should be several periods of 
practice. (c) The rest periods should 
be as short as possible, so that inhibi-
tion should not completely dissipate
during rests but some should remain 
when practice is resumed, interfering 
with extinction of the resting response, 
and resulting in progressively earlier 
and greater performance decrements. 
Such an experiment does not yet ap- 
pear to have been performed in the field 
of motor behavior, but would be most 
useful in helping to settle an issue of 
great importance both for learning 
theory and for therapy (Yates, 1958). 
It must also be pointed out that while 
“blocking” (involuntary rest pauses in 
the present terminology) has been 
widely demonstrated, particularly by 
early workers (e.g., Bills, 1931, 1935; 
Philip, 1939; Warren & Clark, 1937), 
there is as yet no direct evidence for 
or against the occurrence of IRPs in 
pursuit rotor learning. The demonstra-
tion would be difficult, requiring the
separation of error from failure to re- 
spond, but should not be beyond tech- 
nical ingenuity. 

Finally, it may be said to be a meas- 
ure of the usefulness of the inhibition 
hypothesis that in spite of this com- 
parative lack of underpinning, it can be 
made to generate predictions which are 
both precise and testable. The results 
of those experiments which have been 
carried out to date are such as to de- 
mand the retention of the hypothesis, 
at any rate for the present.1


1 Rachman’s finding (1961) that WU was 
absent when a buzzer was sounded for 2 sec- 
onds, 4 1/2 minutes after the beginning of a 
5-minute period of massed practice, but not 
when sounded after 1 minute of practice, while 
incidental to the main purpose of his experi- 
ment, may point to a readily available method 
of manipulating the phenomenon, and would 
be well worth repeating on larger groups. 
COMMENT ON FELDMAN’S “RECONSIDERATION
OF THE EXTINCTION HYPOTHESIS OF WARM 
UP IN MOTOR BEHAVIOR” 
The use of Hull's inhibition postulates for explaining warm-up decrement
(WU) in motor performance curves is criticized. The logical difficulties 
with these postulates and the empirical support for them are reviewed, 
and the postulates are regarded as expressing theoretical mechanisms and 
relationships of doubtful scientific value. Recent British studies that 
attempt to relate drive, reminiscence, and WU are also criticized because 
of problems in their data and because they fail to consider proved weak- 
nesses of the inhibition postulates. Warm-up decrement, as well as find- 
ings on work and distribution of practice that have been customarily 
explained by the postulate set, are considered to be without satisfactory
theoretical explanation at this time.


The University of London investiga- 
tors are late arrivals on the work inhibi- 
tion scene, and they have all the zeal for 
Hull’s (1943, 1952) IR and sIR that a
number of American experimental psy- 
chologists had a few years ago. Repre- 
sentative of this enthusiasm is Feldman’s 
(1963) reply to my criticism (Adams, 
1961) of Eysenck’s (1956) use of Hull’s 
inhibition postulates to describe warm- 
up decrement (WU). 

It would be profitable to review why 
American workers became disenchanted
with the theoretical status of $I_R$ and $_SI_R$
for human motor behavior. These con- 
structs were not supplanted with su- 
perior theoretical notions. They fell 
from a state of pure scientific grace 
because of their own inadequacies, and 
the empirical data they purported to 
explain are now in theoretical limbo. In 
my recent paper on WU (Adams, 1961), 
I rejected Eysenck’s explanation of WU 
because his data were inconsistent with 
well-documented characteristics of motor
performance curves. If journal space
had allowed, I would have strength- 
ened my disavowal of Eysenck’s inter- 
pretation by showing that his minor 
explanatory successes are destined for 
momentary lustre because the inhibition 
constructs are logically weak and predictions
from them have a poor record of
empirical verification. I would like to
amplify my earlier remarks and show
that Hull’s inhibition postulates are so
shaky that it seems a weak research 
strategy to pursue them any longer— 
either for WU or fatigue-like effects in 
behavior. Regardless of their weak- 
nesses, the postulates should be a topic 
of serious interest for experimental psy- 
chologists because much of what we 
know about fatigue-like effects has been 
through studies within their frame of 
reference.

Presuming an acquaintance with 
Hull’s position, here is an overview of the 
salient logical difficulties of the inhibition 
postulates. 

1. In the final statement of his theory
(Hull, 1952), $_SH_R$ and $_SI_R$ had different
operational definitions, despite their both
being habits. Number of reinforcements
(N) was the primary operation for
$_SH_R$ (Postulate IV), with reduction in
drive stimulus apparently playing a role
(Postulate III). The $_SI_R$ construct, on
the other hand, was related to the number
of times $I_R$ was dissipated (presumably
N) and to the amount of $I_R$ present
(Postulate IX). Hull sees $I_R$ as a “negative
drive.” $_SH_R$ was not related to the
amount of any drive.

2. The asymptote of $_SI_R$ was never
defined in relation to the asymptote for
$_SH_R$. If they both had the same asymptote,
then the positive and the negative
contributors to response potentiality
could eventually negate each other and
the organism would stop responding.
However, organisms certainly seem to go
on responding regardless of how much
massed practice they have had in the
past. This difficulty could be adjusted
in the theory by giving $_SI_R$ a lower
asymptote than $_SH_R$ so that a net potentiality
for responding always exists, but
Hull did not clarify this point. This
problem could be handled by a weakening
of the resting response whose habit
construct is $_SI_R$ and, in fact, Eysenck
(1956) offers the extinction of $_SI_R$ as
the reason for WU. This is easier to say
than do in Hull's theory. Habits as such
are never weakened in Hull's system,
so to talk about the extinction of $_SI_R$
is wrong. Rather, extinction is the reduction
of reaction potential through the
increase of inhibitory potential when
nonreinforced responses occur. Assuming
that we could operationally define
the meaning of nonreinforcement of a
resting response, the theory requires $I_R$
to develop for the resting response whose
dissipation produces a new $_SI_R$ which
works counter to the original $_SI_R$, and so
on to absurdity. Most researchers have
been aware of these difficulties with the
resting response and its constructs and
have never pursued any detailed hypotheses
about them.

3. Hull must have been wary of his
own use of $I_R$ as a “negative drive”
analogous to painful stimulation which
the organism seeks to avoid. Nowhere
did he ever attempt to link $I_R$ with
primary and secondary motivation and
drive stimulus. This is a bothersome
inelegance that makes the theory seem
internally disjointed because $I_R$ stands
apart as an isolated motivational notion.
A sophisticated theory would have a
unified explanation for common mechanisms.

On the empirical side, both $_SI_R$ and $I_R$
have had a thin time of it. The follow- 
ing are the highlights, and the competent 
survey and bibliography by Bilodeau 
and Bilodeau (1961) are suggested for 
further reference. 

1. Those who have researched the 
implications of Hull's inhibition postu- 
lates tried to look kindly upon the logical 
weaknesses and ask if the broad conse- 
quences of the postulates were neverthe- 
less verifiable. The most far-reaching 
deduction surrounded $_SI_R$ and the rather
inescapable implication that massed 
practice induces a stable level of inferior 
performance to that of distributed prac- 
tice. The ease with which massed groups 
on the Rotary Pursuit Test transitioned 
to the level of a distributed control group 
when they were shifted to distributed 
practice (Adams & Reynolds, 1954; 
Reynolds & Adams, 1953) was convincing
evidence against the heart of $_SI_R$.
Schucker, Stevens, and Ellis (1953) also
presented evidence against $_SI_R$ when
they used improved experimental controls
with the Alphabet Printing Task
that Kimble (1949) and others had employed
in studies that supposedly supported
$_SI_R$.

2. The amount of reminiscence over a 
rest interval should be a function of the 
amount of physical work in the prerest 
session. This has not been confirmed 
(Bilodeau, 1952; Bilodeau & Bilodeau, 
1954; Ellis, Montgomery, & Underwood, 
1952). 

3. The $I_R$ construct grows linearly
with number of responses and dissipates 
as an exponential decay function of the 
time between responses. Adams (1956) 
deduced from these functions, along with 
the negative growth function for reaction 
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potential, the family of acquisition 
curves for a paced task under different 
distributions of practice trials. These 
curves were not found empirically 
(Adams, 1954) and the $I_R$ relationships
become suspect. The problem appears to
lie with the linear growth of $I_R$ because
the exponential decay function seems
sound (e.g., Kimble & Horenstein,
1948). 
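
The two functions named in Point 3 can be made concrete with a minimal sketch. It is only an illustration of "linear growth with responses, exponential dissipation over rest"; the increment size, the decay rate, and the function name `reactive_inhibition` are hypothetical choices, not values or names taken from Hull or from the studies cited.

```python
import math


def reactive_inhibition(n_trials, rest, increment=1.0, decay_rate=0.1):
    """Return the amount of I_R present at the start of each trial.

    Assumptions (illustrative only): each response adds a constant
    `increment` (linear growth), and the accumulated total decays
    exponentially during the `rest` interval that follows each response.
    """
    level = 0.0
    history = []
    for _ in range(n_trials):
        history.append(level)
        level += increment                      # growth per response
        level *= math.exp(-decay_rate * rest)   # dissipation over the rest
    return history


# Massed practice (no rest) versus distributed practice (10 time units of rest).
massed = reactive_inhibition(6, rest=0.0)
distributed = reactive_inhibition(6, rest=10.0)
print(massed)
print(distributed)
```

With no rest the accumulated value rises by one increment per response; with rest between responses it levels off, which is the kind of family of curves under different distributions of practice that the deduction concerns.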

If these large difficulties are not 
enough, then perhaps a discussion of 
problems with the data that Feldman defends
should be sobering. First, consider
the involuntary rest pause (IRP),
where the negative drive $I_R$ has increased
to the point of negating the positive
drive (D), causing the subject to temporarily
rest until some of the $I_R$ dissipates.
Kimble (1949) developed this
ingenious interplay between drive and
$I_R$, and it had some reasonableness for
a self-paced task where subjects can 
change their rate of responding or even 
momentarily pause, as the hypothesis 
says. However, IRP is doubtful for a 
paced task like the Rotary Pursuit Test 
where subjects steadily pursue the target 
and never seem to rest during a trial. 
Ammons, Ammons, and Morgan (1958) 
found no empirical support for IRP in 
a careful task analysis of performance 
on the Rotary Pursuit Test. Of course 
it could be argued that the resting re- 
sponse is an inferred state of affairs like 
a mediating response but, speaking cas- 
ually, taking a rest when you are tired 
is an obvious state of affairs and should 
be observable. 

Secondly, Feldman deduces that high 
drive subjects should show less WU than 
low drive subjects, presumably because 
they can tolerate more $I_R$, take fewer
IRPs, and have fewer opportunities to
acquire $_SI_R$. With equal aplomb I can
deduce the opposite. Hull (1952) holds
that an $_SI_R$ increment is an increasing
function of the amount of $I_R$ present.
Since the high drive subjects have more
$I_R$ present at the start of the intersession
rest period than low drive subjects, more
$I_R$ will dissipate, more $_SI_R$ will accrue,
and thus more WU. The derivation of
contradictory theorems is further proof 
of the looseness of the whole system 
within which the London investigators 
are working. 

Thirdly, it is incumbent upon Ey- 
senck's formulation to show that high 
drive subjects operate according to 
theory in other ways. Not only should 
there be postrest differentiation between 
high and low drive levels, with high 
drive subjects showing more reminis- 
cence, but the energizing drive should 
also produce higher prerest performance. 
Neither Eysenck and Maxwell (1961) 
nor Eysenck and Willett (1961) were 
able to show differentiation of their high 
and low drive groups in prerest per- 
formance. The authors puzzle over this 
and rightly so. As for postrest differ- 
ences, Eysenck and Maxwell found 
clear effects in the expected direction but 
Eysenck and Willett found no difference
except that a low drive group had a 
larger WU segment than a high drive 
group which, as I have just discussed, 
is an equivocal deduction. 

In my paper on WU (Adams, 1961), 
I sought to defend the thesis that there 
is evidence for forgetting effects that 
cannot be easily ascribed to direct inter- 
ference with goal responses, although 
certainly the evidence is not without 
exception and there is a great need for 
fresh thinking and data. However, clari- 
fication of WU is hardly likely to come 
from Hull’s inhibition postulates, at 
least as they are now phrased, or by 
insisting, as Feldman does, that investigators
use a narrow set of boundary
conditions whose outlines are legislated 
by characteristics of performance on the 
Rotary Pursuit Test. The findings of 
experiments on the set hypothesis of 
WU in verbal learning (e.g., Irion, 1949)
are also in need of explanation, as well
as rapprochement with sister phenomena
from motor learning.
ERROR TERMS IN TREND ANALYSES¹


JOHN GAITO AND EDWARD D. TURNER
Kansas State University 


3 trend analysis procedures are discussed: the Alexander procedure, 
Grant’s partitioned interaction method, and the unpartitioned interaction 
approach of a number of statisticians. By considering expected mean 
squares and the appropriateness of the assumption of homogeneity of 
interaction trend components, it is indicated that the Grant procedure 
is theoretically more correct and realistic than are the other 2. 


Analysis of variance techniques have 
been powerful tools for the scientists in 
recent years. The various anova models 
with their associated expected mean 
squares [E(MS)] developed by mathe- 
matical statisticians have provided a con- 
venience in teaching the subject (S) and 
in attacking troublesome statistical 
problems. General rules for determining 
E(MS) components for various models 
and designs have been provided by a 
number of people (e.g., Cornfield &
Tukey, 1956; Fraser, 1958; Gaito, 
1960; Green & Tukey, 1960; Green- 
wood, 1956; Scheffé, 1959; Wilk & 
Kempthorne, 1955). Model I (fixed 
effects) and Model II (random effects) 
have been relatively clear-cut and have 
generated no major controversial issues. 
The Mixed Model (fixed and random 
effects) has not fared as well. A few 
years ago a major disagreement devel- 
oped as to whether the interaction vari- 
ance should be included in the E(MS) 
for both mean squares of the random 
and fixed effects. On one side, Mood 
(1950) and McNemar (1955) included 
the term in both mean squares. Other 
individuals placed the interactions only 
in the fixed effect mean square (Ander- 
son & Bancroft, 1952; Greenwood, 
1956; Kempthorne, 1952; Tukey,
1949). This issue was resolved by the
general procedures of Cornfield and
Tukey (1956) and by Wilk and Kempthorne
(1955).

1 The authors are indebted to colleagues
John Overall of the Psychology Department
and Roshan Chadda of the Statistics Department
for a number of interesting discussions
which greatly aided the formulation of the
ideas presented in this paper.

Recently the Mixed Model has been 
involved in another potential issue, the 
question as to the proper error to use in 
testing each of the components in a 
trend analysis involving orthogonal 
polynomials. For Treatments x Ss, Trials 
x Ss, Treatments x Blocks, or other de- 
signs involving at least one random fac- 
tor, statisticians tend to differ as to the 
proper error term, which suggests that 
statisticians differ with regard to the 
components included in the E(MS) of 
the various sources. This paper will 
review the various procedures which 
have been offered in testing the linear, 
quadratic, and higher components of the 
quantitative main effects and attempt 
to resolve this problem. 


TREND ANALYSIS PROCEDURES 


Psychologists have not always been 
content with yes-no decisions in their 
statistical analyses. Thus many have 
resorted to the use of regression curves 
or trend analyses to obtain more infor- 
mation than is available in a simple two- 
decision analysis. Trend analyses have 
been used frequently in recent years 
with repeated measurements and other 
designs. Two important presentations of 
trend analyses have influenced psy- 
chologists greatly. Alexander (1946) 
described a useful technique to evaluate 
trends of treatments and individual Ss.
Grant (1956) further extended the procedure
so as to provide for orthogonal
components using orthogonal polyno- 
mials. 

Trend analyses are merely extensions 
of the usual analysis of variance procedures.
The type of design that is involved
in many repeated measurements
trend analyses is called a “mixed” design
(Lindquist, 1953) or a “partially
hierarchical” design (Brownlee, 1960).
A mixed design that both Alexander and 
Grant were concerned with was Lind- 
quist's Type I design. This involves two 
or more groups of different Ss with each 
S having a measurement on two or more 
occasions. The simplest example is the 
situation in which a control group and 
an experimental group are run in a 
learning experiment. Thus two or more 
trials would be involved. It is important 
that the dimension on which repeated 
measurements are obtained for each S 
be quantitative and involve equal units. 
If equal units are not present, the tabled 
orthogonal coefficients are not applica- 
ble; however, the appropriate coeffi- 
cients can be obtained by solving simul- 
taneous equations discussed by Lewis 
(1960, pp. 407—408). 
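
As a rough illustration of how coefficients for unequally spaced levels might be obtained numerically, the sketch below orthogonalizes the powers of the level values with a QR decomposition. This is a substitute technique, not the simultaneous-equations method of Lewis (1960); the function name `orthogonal_poly_coefficients` and the example spacings are assumptions made for illustration. The coefficients come out scaled to unit length, so for equal spacing they are proportional to the tabled integer values.

```python
import numpy as np


def orthogonal_poly_coefficients(levels, max_degree=None):
    """Return orthonormal polynomial contrasts, one row per degree 1..max_degree.

    `levels` are the (possibly unequally spaced) values of the quantitative
    dimension. Assumption: the levels are distinct.
    """
    x = np.asarray(levels, dtype=float)
    n = x.size
    if max_degree is None:
        max_degree = n - 1          # n points determine a curve of degree n - 1
    # Columns 1, x, x^2, ...; centering x improves numerical conditioning.
    powers = np.vander(x - x.mean(), N=max_degree + 1, increasing=True)
    q, r = np.linalg.qr(powers)
    # Fix the arbitrary sign so each contrast correlates positively with its power.
    q = q * np.sign(np.diag(r))
    return q[:, 1:].T               # drop the constant column


# Equal spacing: rows are proportional to (-3, -1, 1, 3), (1, -1, -1, 1), (-1, 3, -3, 1).
print(orthogonal_poly_coefficients([1, 2, 3, 4]))
# Hypothetical unequal spacing, where the tabled coefficients would not apply.
print(orthogonal_poly_coefficients([1, 2, 4, 8]))
```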

Table 1 shows the E(MS) for the 
analysis of a Type I mixed design in 
which we assume that the Grant proce- 
dure is correct. We will show later that 
this assumption is justified. We will use 
Lindquist’s procedure of showing the 
between-Ss and within-Ss components, 
each with their sums of squares parti- 
tioned into orthogonal components. 
This procedure presents a clear picture 
for the reader and allows one to relate
the structure of a trend analysis to that 
of the usual nontrend analysis of vari- 
ance. The terms in parentheses for each 
source are those of Grant. The terms of 
Alexander are similar. The table is for 
an experiment in which there are four 
trials. Thus linear, quadratic, and cubic 


components can be obtained in that any 
curve with $n$ points can be fitted exactly
by a curve of degree $n - 1$. Alexander's
analysis (his Tables 10 and 16) differs
slightly from Table 1 in that he obtains
the linear component (which he calls
“slope”) and the remainder is lumped
together as a deviation sum of squares 
(shown in Table 1 by the single braces). 
Grant's Table 7 is also different from our 
Table 1 in that he has a curve over five 
points and thus has a quartic component 
as well as the linear, quadratic, and 
cubic ones, plus other orthogonal parti- 
tioned sources which are not pertinent 
to the present topic. However, Table 1 
is sufficient to illustrate the analysis. An 
important point to notice is that the 
between-Ss effects include at least
$\sigma_e^2$ (errors of measurement) and $\sigma_s^2$
(sampling error: individual differences),
whereas none of the within-Ss components
contains $\sigma_s^2$. This indicates that the
same error term cannot be used for all
F tests.

Alexander uses the Ss × Trials within-groups-residual
component as an error
term throughout the analyses. This consists
of the Ss × Trials quadratic and
cubic components. Intuitively one can
see that a single error term throughout
the analysis will be inadequate in that
some sources involve independent group
comparisons whereas others are based on
the same group. In the former case, experimental
error will consist of $\sigma_e^2$ and
$\sigma_s^2$; in the latter, $\sigma_e^2$ and $\sigma_{ts}^2$ due to S
interactions are present. Furthermore,
by separating the Trials × Ss within
groups into two components, linear and
residual, and using the latter as the estimate
of error, Alexander must assume
that the $\sigma_{ts(r)}^2$ components are not present.

The E(MS) provides a clear picture 
of the bias in Alexander’s procedures. 
If we look at the components included 
in the denominator and numerator of the 
various F tests, we see that most of the 
TABLE 1

EXPECTED MEAN SQUARES FOR THE VARIOUS
COMPONENTS OF A TREND ANALYSIS FOR
EXPERIMENT INVOLVING FOUR TRIALS:
GRANT PROCEDURE

Source                                        E(MS)a

Between Ss
  Groups (between-group means)                $\sigma_e^2 + \sigma_s^2 + \sigma_g^2$
  Ss within groups
    (between-individual means)                $\sigma_e^2 + \sigma_s^2$
Within Ss
  Trials (overall trend)                      $\sigma_e^2 + \sigma_{ts}^2 + \sigma_t^2$
    Linear                                    $\sigma_e^2 + \sigma_{ts(l)}^2 + \sigma_{t(l)}^2$
    Quadratic  } Residual                     $\sigma_e^2 + \sigma_{ts(q)}^2 + \sigma_{t(q)}^2$
    Cubic      }                              $\sigma_e^2 + \sigma_{ts(c)}^2 + \sigma_{t(c)}^2$
  Trials × Groups
    (between-group trends)                    $\sigma_e^2 + \sigma_{ts}^2 + \sigma_{tg}^2$
    Linear                                    $\sigma_e^2 + \sigma_{ts(l)}^2 + \sigma_{tg(l)}^2$
    Quadratic  } Residual                     $\sigma_e^2 + \sigma_{ts(q)}^2 + \sigma_{tg(q)}^2$
    Cubic      }                              $\sigma_e^2 + \sigma_{ts(c)}^2 + \sigma_{tg(c)}^2$
  Trials × Ss within groups
    (between-individual trends)               $\sigma_e^2 + \sigma_{ts}^2$
    Linear                                    $\sigma_e^2 + \sigma_{ts(l)}^2$
    Quadratic  } Residual                     $\sigma_e^2 + \sigma_{ts(q)}^2$
    Cubic      }                              $\sigma_e^2 + \sigma_{ts(c)}^2$

a The coefficients for each component are not included
in this and later tables because their presence
is not relevant to this paper.


Alexander tests involve a positive bias
(increase in Type I error) but several
have a negative bias (increase in Type II
error). The tests of the linear Trials × Ss
source and the between-Ss, within-groups
components are both negatively
biased unless $\sigma_{ts(r)}^2$ (combination of the
quadratic and cubic components) is
equal to zero. The test of between groups
will be unbiased if $\sigma_s^2 = \sigma_{ts(r)}^2$. If $\sigma_s^2$ is
greater than $\sigma_{ts(r)}^2$, positive bias occurs;
if the reverse, negative bias. All other
tests can be positively or negatively
biased. For example, the expected value
of the F test [E(F)] of the trials linear
component is:

$$E(F) = \frac{\sigma_e^2 + \sigma_{ts(l)}^2 + \sigma_{t(l)}^2}{\sigma_e^2 + \sigma_{ts(r)}^2}$$

This test will be unbiased only if $\sigma_{ts(l)}^2$
and $\sigma_{ts(r)}^2$ are equal. Positive bias will
occur if $\sigma_{ts(l)}^2 > \sigma_{ts(r)}^2$; negative bias, if
the reverse occurs. Similar results can
be shown for the other F tests.
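
The direction of the bias can be checked with a few lines of arithmetic. The sketch below simply evaluates the ratio of expected mean squares for the trials linear test when the residual interaction is used as error; the function name `expected_f_ratio` and the variance values are hypothetical, and the coefficients on each component are ignored, as they are in the tables.

```python
def expected_f_ratio(var_e, var_ts_linear, var_ts_residual, var_t_linear=0.0):
    """Ratio of expected mean squares for Trials-linear tested against the
    residual (quadratic plus cubic) interaction.

    With var_t_linear = 0 (no true linear trend) an unbiased test has ratio 1.
    """
    numerator = var_e + var_ts_linear + var_t_linear
    denominator = var_e + var_ts_residual
    return numerator / denominator


# Hypothetical variance components, null hypothesis true in every case.
print(expected_f_ratio(1.0, 2.0, 0.5))  # linear interaction larger: ratio above 1
print(expected_f_ratio(1.0, 0.5, 2.0))  # residual interaction larger: ratio below 1
print(expected_f_ratio(1.0, 1.0, 1.0))  # equal components: ratio of 1
```

A ratio above 1 under the null hypothesis corresponds to the positive bias described above, and a ratio below 1 to the negative bias.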

Grant’s procedures are an improvement
over those of Alexander; he provides
more orthogonal components (giving
more useful information) and his F
tests are more appropriate than are
Alexander’s. He does not test the various
components of Trials × Ss within
groups because no estimate of $\sigma_e^2$ alone
is available. Between groups is tested by
between Ss within groups, giving an unbiased
F test. The latter term is tested
against the Trials × Ss within groups
(nonpartitioned). This will produce a
negative bias if $\sigma_{ts}^2$ is present; however,
it is usual to assume that this component
is zero when there is no within-cells component
as an estimate of $\sigma_e^2$.

For the within-Ss portions, Grant 
tests the linear components of Trials and 
of Trials × Groups by the linear portion
of Trials × Ss within groups; quadratic
components are tested by the quadratic 
term of Trials x Ss; cubic, by cubic; etc. 
Later we will show that the Grant pro- 
cedure is theoretically correct. 

Other statisticians do not follow the 
Alexander procedure for various designs 
utilizing trend analysis. Some suggest 
procedures similar to those of Grant. 
For example, Snedecor (1946) maintained
that the proper error for each of
the various components (linear, quadratic,
etc.) of the quantitative treatments
effect was the corresponding component
of the Treatments × Blocks
source. However, he used the overall
interaction term when the various com- 
ponents were homogeneous. 

Others favor testing each of these 
components by the unpartitioned interaction
source of variation. Among those
suggesting this procedure were Lindquist
(1953) and Lewis (1960). However, 
the latter maintained that the trend for 
each S should be the same and that 
homogeneity of variance in each cell 
should prevail. Edwards (1960) also 
favored using the unpartitioned interac- 
tion as the appropriate error but stated 
that the various interaction components 
should be homogeneous. 

Although the statisticians who recom- 
mend the unpartitioned interaction as 
error do not indicate the E(MS) for the 
trend analysis, one might get the impres- 
sion that their anova matrix would be as 
indicated in Table 2. The only difference
between Tables 1 and 2 is that in
the latter table the E(MS) for Trials
and Trials × Groups components contain
$\sigma_{ts}^2$ (the unpartitioned interaction variance),
whereas the former table uses
$\sigma_{ts(l)}^2$ for linear, $\sigma_{ts(q)}^2$ for quadratic,
and $\sigma_{ts(c)}^2$ for cubic. By including $\sigma_{ts}^2$ in
the E(MS), tests of Trials and Trials ×
Groups components by the unpartitioned
interaction source of variation would be
theoretically correct. Below we will indicate
that the E(MS) of Table 1 is
correct. 

In summary there are three suggested 
methods for a trend analysis (as exem- 
plified by the Type I design discussion), 
the Alexander procedure, the Grant par- 
titioned error method, and the unparti- 
tioned error procedure of a number of 
statisticians. Few individuals would de- 
fend the Alexander procedure (without 
making a number of unrealistic assump- 
tions). Thus the discord is between the 
partitioned and unpartitioned error pro- 
cedures. The major question is: should 
the interaction source of variation be 
used as the error term in the test of each 
component of the quantitative fixed ef- 
fect or should the interaction be parti- 
tioned into separate sources, each of 
which would be used to test a single 
component of the fixed effect? 
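TABLE 2

EXPECTED MEAN SQUARES FOR THE VARIOUS
COMPONENTS OF A TREND ANALYSIS FOR
EXPERIMENT INVOLVING FOUR TRIALS:
ALTERNATIVE PROCEDURE

Source                                        E(MS)

Between Ss
  Groups (between-group means)                $\sigma_e^2 + \sigma_s^2 + \sigma_g^2$
  Ss within groups
    (between-individual means)                $\sigma_e^2 + \sigma_s^2$
Within Ss
  Trials (overall trend)                      $\sigma_e^2 + \sigma_{ts}^2 + \sigma_t^2$
    Linear                                    $\sigma_e^2 + \sigma_{ts}^2 + \sigma_{t(l)}^2$
    Quadratic  } Residual                     $\sigma_e^2 + \sigma_{ts}^2 + \sigma_{t(q)}^2$
    Cubic      }                              $\sigma_e^2 + \sigma_{ts}^2 + \sigma_{t(c)}^2$
  Trials × Groups
    (between-group trends)                    $\sigma_e^2 + \sigma_{ts}^2 + \sigma_{tg}^2$
    Linear                                    $\sigma_e^2 + \sigma_{ts}^2 + \sigma_{tg(l)}^2$
    Quadratic  } Residual                     $\sigma_e^2 + \sigma_{ts}^2 + \sigma_{tg(q)}^2$
    Cubic      }                              $\sigma_e^2 + \sigma_{ts}^2 + \sigma_{tg(c)}^2$
  Ss × Trials within groups
    (between-individual trends)               $\sigma_e^2 + \sigma_{ts}^2$
    Linear                                    $\sigma_e^2 + \sigma_{ts(l)}^2$
    Quadratic  } Residual                     $\sigma_e^2 + \sigma_{ts(q)}^2$
    Cubic      }                              $\sigma_e^2 + \sigma_{ts(c)}^2$
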


In attempting to answer this question 
it might appear that the procedure used 
depends on the assumptions which the 
investigator is willing to make. If there 
is a lack of homogeneity, the partitioned 
error is suggested. If homogeneity is 
present, the unpartitioned error is appropriate.
However, we maintain that
the appropriate error for testing linear
and higher components does not depend
on the usual assumptions which the researcher
makes but is indicated by the
mathematical model. In fact the usual
assumptions used to justify the unparti- 
tioned interaction as error are irrele- 
vant or theoretically inappropriate and 
unrealistic. 
In determining the appropriate tests
in a trend analysis, there appear to be 
two main problems: (a) What are the 
components of variance to be expected 
in the mean square of each source of 
variation? (b) Is it theoretically correct 
to pool the components of the interac- 
tion under the assumption of homoge- 
neity? We will consider each of these 
problems. 


COMPONENTS OF VARIATION 


The components which are expected 
in each mean square can be determined 
by following the E(MS) rules cited by 
the above references. If one is concerned 
with fixed and random effects from an
infinite or very large population, the
E(MS) components for each mean
square are (a) $\sigma_e^2$; (b) $\sigma^2$ due to the
source of concern; and (c) $\sigma^2$ involving
that source interacting with other variables
which are random.

From these rules it is easy to show
that the E(MS) of Table 1 can be derived.
For example, if we are concerned
with the E(MS) for Trials-linear we
would have $\sigma_e^2 + \sigma_{t(l)}^2 + \sigma_{ts(l)}^2$, using a,
b, and c above. Individuals following
the procedure indicated by Table 2
would suggest that $\sigma_{ts}^2$ rather than $\sigma_{ts(l)}^2$
should appear. However, according to
Rule c, the latter is present. It is the
interaction of $\sigma_{t(l)}^2$ with the random
variable, Ss, which occurs. This can only
be $\sigma_{ts(l)}^2$.

Even though these rules can be used 
effectively in determining E(MS), we 
will also present examples which should 
provide a clear and intuitive understand- 
ing of what components are present in 
each trend analysis design. 

To consider the components in each 
mean square we will take a simpler case 
than the Type I design. Let us begin 
with a Treatments × Ss design wherein
the treatments dimension will be quantitative.
Assume that the treatments dimension
consists of four points such that
the 3 degrees of freedom for Treatments
can be partitioned into linear, quadratic,
and cubic components, each with df = 1.
Let us assume also that we have five Ss
and that we can order them in some
fashion such that we can partition the
4 degrees of freedom between Ss into
linear, quadratic, cubic, and quartic
components, each consisting of a sum of
squares with df = 1. This allows us to
show the Treatments × Ss matrix with 12
cells, each with df = 1 (Table 3). Each
cell shows an interaction between treatments
and S components, i.e., $T_lS_l$,
$T_lS_q$, ..., $T_cS_{q'}$. Therefore, the
anova matrix is as in Table 4.

We have ordered both dimensions and, 
thus, both must be considered as fixed 
effects. Therefore, the Treatments × Ss
components are not present in the 
E(MS) for either treatments or Ss. 
This leads to a situation wherein there 
are no valid tests for any of the treat- 
ments or Ss portions. 

Now let us allow Ss to be a random 
effect (a tenuous assumption in this case 


TABLE 3

INTERACTION MATRIX SHOWING COMPONENTS
FOR EACH OF THE TWO DIMENSIONS

                        Treatment
                L            Q            C
Ss    L     $T_lS_l$     $T_qS_l$     $T_cS_l$
      Q     $T_lS_q$     $T_qS_q$     $T_cS_q$
      C     $T_lS_c$     $T_qS_c$     $T_cS_c$
      Q′    $T_lS_{q'}$  $T_qS_{q'}$  $T_cS_{q'}$

Note. Q refers to quadratic; Q′ to quartic. Each
cell is based on a sum of squares with df = 1.
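TABLE 4

ANOVA MATRIX FOR THE TREATMENTS × SS
DESIGN: BOTH EFFECTS FIXED
AND ORDERED

Source            df    E(MS)

Treatments         3    $\sigma_e^2 + \sigma_t^2$
  Linear (L)       1    $\sigma_e^2 + \sigma_{t(l)}^2$
  Quadratic (Q)    1    $\sigma_e^2 + \sigma_{t(q)}^2$
  Cubic (C)        1    $\sigma_e^2 + \sigma_{t(c)}^2$
Ss                 4    $\sigma_e^2 + \sigma_s^2$
  Linear           1    $\sigma_e^2 + \sigma_{s(l)}^2$
  Quadratic        1    $\sigma_e^2 + \sigma_{s(q)}^2$
  Cubic            1    $\sigma_e^2 + \sigma_{s(c)}^2$
  Quartic (Q′)     1    $\sigma_e^2 + \sigma_{s(q')}^2$
T × Ss            12    $\sigma_e^2 + \sigma_{ts}^2$
  L × L            1    $\sigma_e^2 + \sigma_{t(l)s(l)}^2$
  L × Q            1    $\sigma_e^2 + \sigma_{t(l)s(q)}^2$
  L × C            1    $\sigma_e^2 + \sigma_{t(l)s(c)}^2$
  L × Q′           1    $\sigma_e^2 + \sigma_{t(l)s(q')}^2$
  Q × L            1    $\sigma_e^2 + \sigma_{t(q)s(l)}^2$
  Q × Q            1    $\sigma_e^2 + \sigma_{t(q)s(q)}^2$
  Q × C            1    $\sigma_e^2 + \sigma_{t(q)s(c)}^2$
  Q × Q′           1    $\sigma_e^2 + \sigma_{t(q)s(q')}^2$
  C × L            1    $\sigma_e^2 + \sigma_{t(c)s(l)}^2$
  C × Q            1    $\sigma_e^2 + \sigma_{t(c)s(q)}^2$
  C × C            1    $\sigma_e^2 + \sigma_{t(c)s(c)}^2$
  C × Q′           1    $\sigma_e^2 + \sigma_{t(c)s(q')}^2$
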


because we have ordered Ss) and make 
use of the matrix in Table 3. The Treatments
× Ss matrix would be partitioned
into three orthogonal components, each
with df = 4. Each of these consists of
the four components within each treatment
column, each with df = 1. For the
linear component the first column of 
Table 3 is used; for quadratic, second 
column; for cubic, third column. The 
E(MS)s are indicated in Table 5. The
parentheses around the Treatments x Ss 
interactions indicate that the portions 
are all included in one of the columns in 
Table 3. Because a random variable has 
been introduced, a single interaction 
term involving a treatments component 
with Ss will appear in the E(MS) for 
each treatments portion. Only the linear 
portions of the interaction are included 
in the E(MS) for Treatments-linear be- 
cause the quadratic and cubic compo- 
nents of Treatments are not involved in 
the interaction. Likewise, only the quad- 
ratic interaction appears for Treatments- 
quadratic and cubic interaction, for 
Treatments-cubic. 

Thus with the E(MS) as in Table 5, 
the procedure of testing the Treatments-linear
component by Treatments × S-linear,
quadratic by quadratic, and cubic
by cubic would provide valid F tests.
However, if we order Ss so as to parti- 
tion Ss into orthogonal components it is 
meaningless to consider Ss as a random 
effect. To realistically assume Ss to be 
random we may not order them. Thus 


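TABLE 5

ANOVA MATRIX ASSUMING THAT THE SOURCE, SS, IS A RANDOM EFFECT

Source          df    E(MS)

Treatments       3    $\sigma_e^2 + \sigma_{ts}^2 + \sigma_t^2$
  Linear         1    $\sigma_e^2 + (\sigma_{t(l)s(l)}^2 + \sigma_{t(l)s(q)}^2 + \sigma_{t(l)s(c)}^2 + \sigma_{t(l)s(q')}^2) + \sigma_{t(l)}^2$
  Quadratic      1    $\sigma_e^2 + (\sigma_{t(q)s(l)}^2 + \sigma_{t(q)s(q)}^2 + \sigma_{t(q)s(c)}^2 + \sigma_{t(q)s(q')}^2) + \sigma_{t(q)}^2$
  Cubic          1    $\sigma_e^2 + (\sigma_{t(c)s(l)}^2 + \sigma_{t(c)s(q)}^2 + \sigma_{t(c)s(c)}^2 + \sigma_{t(c)s(q')}^2) + \sigma_{t(c)}^2$
Ss               4    $\sigma_e^2 + \sigma_s^2$
  Linear         1    $\sigma_e^2 + \sigma_{s(l)}^2$
  Quadratic      1    $\sigma_e^2 + \sigma_{s(q)}^2$
  Cubic          1    $\sigma_e^2 + \sigma_{s(c)}^2$
  Quartic        1    $\sigma_e^2 + \sigma_{s(q')}^2$
T × Ss          12    $\sigma_e^2 + \sigma_{ts}^2$
  Linear         4    $\sigma_e^2 + (\sigma_{t(l)s(l)}^2 + \sigma_{t(l)s(q)}^2 + \sigma_{t(l)s(c)}^2 + \sigma_{t(l)s(q')}^2)$
  Quadratic      4    $\sigma_e^2 + (\sigma_{t(q)s(l)}^2 + \sigma_{t(q)s(q)}^2 + \sigma_{t(q)s(c)}^2 + \sigma_{t(q)s(q')}^2)$
  Cubic          4    $\sigma_e^2 + (\sigma_{t(c)s(l)}^2 + \sigma_{t(c)s(q)}^2 + \sigma_{t(c)s(c)}^2 + \sigma_{t(c)s(q')}^2)$
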
Table 3 would have to be modified to
produce Table 6 and we can no longer 
partition the 12 degrees of freedom for 
the interaction into 12 orthogonal com- 
ponents as before. However, we can still 
separate three components, each with 
df = 4, which represent the differences
between Ss in linearity, quadraticity, 
and cubicity. Furthermore, Treatments- 
linear involves only the interaction of 
linear aspects with Ss and thus the 
E(MS) of the linear component would 
be $\sigma_e^2 + \sigma_{ts(l)}^2 + \sigma_{t(l)}^2$ as in Table 7. Similar
arguments would be offered for the
quadratic and cubic components. These 
considerations would justify testing 
linear by linear, quadratic by quadratic, 
and cubic by cubic. 
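
A minimal numerical sketch of the two competing error terms for a Treatments × Ss design may help here. It forms each S's score on every trend component, tests the mean score against the variation of those scores among Ss (the partitioned error, df = 1 and n - 1), and also computes the ratio that would result from the unpartitioned interaction mean square. The function name `trend_f_tests`, the dictionary keys, and the data are hypothetical; equal spacing of treatments is assumed so that the tabled coefficients apply.

```python
import numpy as np


def trend_f_tests(scores, contrasts):
    """F ratios for trend components in a Treatments x Ss layout.

    scores:    array of shape (n_subjects, k_treatments)
    contrasts: orthogonal polynomial coefficients, one row per component
    """
    y = np.asarray(scores, dtype=float)
    c = np.asarray(contrasts, dtype=float)
    c = c / np.linalg.norm(c, axis=1, keepdims=True)   # unit-length contrasts
    n, k = y.shape

    comp = y @ c.T                                      # each S's component scores
    ss_effect = n * comp.mean(axis=0) ** 2              # Treatments-linear, etc.
    ss_error = ((comp - comp.mean(axis=0)) ** 2).sum(axis=0)  # Treatments x S parts
    ms_error = ss_error / (n - 1)

    # Unpartitioned Treatments x Ss interaction, df = (n - 1)(k - 1).
    resid = y - y.mean(axis=0) - y.mean(axis=1, keepdims=True) + y.mean()
    ss_interaction = (resid ** 2).sum()
    ms_pooled = ss_interaction / ((n - 1) * (k - 1))

    return {
        "F_partitioned": ss_effect / ms_error,
        "F_unpartitioned": ss_effect / ms_pooled,
        "ss_error_components": ss_error,
        "ss_interaction": ss_interaction,
    }


# Hypothetical scores for five Ss on four equally spaced treatments.
data = [[2, 4, 7, 9], [1, 3, 4, 8], [3, 4, 6, 7], [2, 5, 5, 10], [1, 2, 6, 6]]
coefs = [[-3, -1, 1, 3], [1, -1, -1, 1], [-1, 3, -3, 1]]
result = trend_f_tests(data, coefs)
print(result["F_partitioned"])
print(result["F_unpartitioned"])
```

When the full set of orthogonal components is supplied, the component error sums of squares add up to the unpartitioned interaction sum of squares, so the two approaches differ only in whether that total is split before it is used as error.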

Having considered the E(MS) for the 
Treatments X Ss design, the E(MS) for 
a Type I design follows immediately and 
is that given previously in Table 1. 
These results can also be generalized to 
more complex designs. 

This discussion using E(MS) has in- 
dicated that the Grant procedure for 


TABLE 6

INTERACTION MATRIX OF TABLE 3 WHEN THE
SOURCE, SS, IS NOT ORDERED

                  Treatment
              L       Q       C
Ss    S1
      S2
      S3
      S4
      S5

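TABLE 7

ANOVA MATRIX IN FINAL FORM AFTER
MODIFICATION OF TABLE 5

Source            df    E(MS)

Treatments (T)     3    $\sigma_e^2 + \sigma_{ts}^2 + \sigma_t^2$
  Linear           1    $\sigma_e^2 + \sigma_{ts(l)}^2 + \sigma_{t(l)}^2$
  Quadratic        1    $\sigma_e^2 + \sigma_{ts(q)}^2 + \sigma_{t(q)}^2$
  Cubic            1    $\sigma_e^2 + \sigma_{ts(c)}^2 + \sigma_{t(c)}^2$
Ss                 4    $\sigma_e^2 + \sigma_s^2$
T × Ss            12    $\sigma_e^2 + \sigma_{ts}^2$
  Linear           4    $\sigma_e^2 + \sigma_{ts(l)}^2$
  Quadratic        4    $\sigma_e^2 + \sigma_{ts(q)}^2$
  Cubic            4    $\sigma_e^2 + \sigma_{ts(c)}^2$
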

testing the various components of the 
quantitative effect is theoretically cor- 
rect. However, some individuals would 
suggest the pooling of the various Treat- 
ments x Ss components for a more relia- 
ble estimate to use as the error term. 
Let us now turn to this problem. 


HOMOGENEITY AND POOLING 


Those who suggest the use of the un- 
partitioned interaction as error term in 
the F tests of the various components of 
the quantitative main effect utilize one 
or more assumptions to justify this pro- 
cedure. These assumptions are: (a) 
the variances within cells are homo- 
geneous; (b) the trends for each S are 
the same; and (c) the various components 
of the trend (Treatments × S-linear, 
Treatments × S-quadratic, etc.) 
are homogeneous. We maintain that the 
first two assumptions are irrelevant to 
the F tests of concern and that the third 
assumption is theoretically inappropriate 
and unrealistic. 

The homogeneity of cell variances is 
irrelevant because the concern is not 
with the within-cell variation as such but 
with the linear, quadratic, and higher 
components in a trend analysis. For ex- 
ample, in the F test of Treatments-linear, 
the null hypothesis which is being evaluated 
states that the Treatments × S-linear 
portions (the appropriate error 
estimate) are normally distributed, 
independent, with a single $\sigma^2$. Thus the 
variation of the linear values within $T_1$, 
$T_2$, ..., $T_n$ (after removing the effects 
of Treatments-linear and of Ss) should 
be equal. Each of these variations is an 
estimate of $\sigma_e^2 + \sigma_{ts(l)}^2$. Likewise, in the 
F test of Treatments-quadratic it is the 
variation of the quadratic values within 
$T_1$, $T_2$, ..., $T_n$ (after removing the effects 
of Treatments-quadratic and of Ss) 
which should be equal. Each of these 
variations estimates $\sigma_e^2 + \sigma_{ts(q)}^2$, etc. 

The trends do not have to be the same 
for each level of the random effect (Ss, 
blocks) because the mixed model allows 
for different trends to occur in testing 
the fixed effect. The interaction (differ- 
ential trends for each level of the ran- 
dom effect) is included in the E(MS) of 
the treatments effect. Thus the test of 
this effect is above and beyond the differential 
trends which are allowed. 

Homogeneity of components of trend 
is theoretically inappropriate and un- 
realistic except for the special case when 
all Treatments X S components are non- 
existent. To be theoretically appropriate 
to the specific null hypotheses of concern 
would require the assumption that each 
interaction component is an estimate of 
the same thing. However, the E(MS) 
indicates that each estimates $\sigma_e^2$ plus 
a unique parameter. Linear estimates 
$\sigma_e^2 + \sigma_{ts(l)}^2$; quadratic, $\sigma_e^2 + \sigma_{ts(q)}^2$; and 
cubic, $\sigma_e^2 + \sigma_{ts(c)}^2$. Contrary to this 
argument one might maintain that each 
does contribute to and estimate $\sigma_e^2 + \sigma_{ts}^2$, 
which is certainly true. However, 
if one is interested in separating and 
evaluating trends (rather than the usual 
nontrend analysis), then these unique 
parameters assume unique importance 
relative to the null hypotheses and the 
$\sigma_{ts}^2$ parameter is of no importance. 
Changing from a nontrend analysis to a 
trend analysis requires changing one's 
orientation. The $\sigma_{ts}^2$ parameter would 
be important only if the trials effect is 
unpartitioned. 

Furthermore, involved in this assumption 
is the requirement that $\sigma_{ts(l)}^2 = \sigma_{ts(q)}^2 = \cdots$. 
This is an unrealistic 
assumption in that frequently only the 
first, or first and second components, 
will be present. Or if all components 
are present, they may not be equal. Ob- 
viously, as the number of points on the 
curve increases, the greater is the proba- 
bility that they are not all present and 
equal. 

To indicate the dangers involved in 
this homogeneity assumption and pool- 
ing of components procedure, let us take 
some simple examples. 

Example 1. First we will take a simple 
one-factor experiment with four qualitatively 
different treatments ($T_1$, $T_2$, $T_3$, 
$T_4$) and five different Ss within each of 
the treatments. The 3 degrees of free- 
dom for treatments are partitioned into 
three orthogonal components, each with 
df=1. The error by which each ortho- 
gonal component is tested is the common 
or pooled between-Ss, within-treatments 
effect. The pooling is justified on the 
basis of the assumption that each within- 
treatments component is an estimate of 
the same parameters, $\sigma_e^2 + \sigma_s^2$ (errors of 
measurement plus sampling error). 

Example 2. We have a Treatments × Ss 
design and the treatments dimension will 
be quantitative. Table 7 shows the 
E(MS) for this design. The appropriate 
error term for the F test of the linear 
component of treatments is the linear portion of the 
interaction; quadratic is tested by the 
quadratic component; etc., as Grant 
had indicated. An important point to 
note is that the components of 
the interaction are not all estimating the 
same thing. Each mean square is an 
estimate of $\sigma_e^2$ plus a unique parameter. 

One might be tempted to pool the 
various interaction components in Example 2 
under the assumption of homogeneity 
as in Example 1, as is done by 
Edwards, Lewis, Lindquist, and Snedecor. 
However, this pooling is negated 
in that in Example 1 the within variations 
were each estimates of $\sigma_e^2 + \sigma_s^2$, but 
in Example 2 each component is an estimate 
of $\sigma_e^2$ and an extra unique portion. 
The components are not all estimates 
of the same thing, i.e., Treatments × S-linear, 
Treatments × S-quadratic, 
Treatments × S-cubic, etc., are not 
estimates of $\sigma_e^2 + \sigma_{ts}^2$; they are each an 
estimate of $\sigma_e^2$ plus a component which 
is a part of $\sigma_{ts}^2$. 

Thus it follows that the partitioned 
error procedure is most appropriate in 
the type of trend analysis of concern 
here and that homogeneity of compo- 
nents (as usually expressed) is inappro- 
priate inasmuch as the various compo- 
nents do not estimate the same thing. 
Homogeneity would be meaningful only 
if they were estimates of the same thing. 

Let us take another example to show 
that homogeneity is inappropriate and 
unrealistic and if utilized can lead to 
biased F tests. Assume again that we 
have a Treatments X Ss design with four 
treatments, thus providing linear, quad- 
ratic, and cubic components. 

1. Assume $\sigma_{ts(l)}^2 = \sigma_{ts(q)}^2 = \sigma_{ts(c)}^2 = 0$. 
This is a special case in which the unpartitioned 
procedure would be correct 
and would be more powerful because 
greater degrees of freedom are available. 
Homogeneity of each mean square would 
be of concern because each component 
estimates only $\sigma_e^2$. 

2. Assume $\sigma_{ts(l)}^2 = \sigma_{ts(q)}^2 = \sigma_{ts(c)}^2 > 0$. 
If all three components are present and 
equal and the unpartitioned procedure 
were used, F tests would not be biased. 
This occurs because $\sigma_{ts}^2$ is an averaging 
of the three components ($\sigma_{ts(l)}^2$, $\sigma_{ts(q)}^2$, 
$\sigma_{ts(c)}^2$) and if the three are equal, $\sigma_{ts}^2$ 
is equal to any one of them. However, 
even though unbiased F tests would result, 
the homogeneity of components can 
be considered inappropriate because 
each component is estimating a different 
variance. 

As an analogy to this case, take a 
four-factor design consisting of A, B, C, 
and Ss. Let us assume that $\sigma_{as}^2 = \sigma_{bs}^2 = 
\sigma_{cs}^2 > 0$ and that the coefficients for each 
one are also equal. The mean squares 
for AS, BS, and CS are each estimating 
$\sigma_e^2$, but each is also estimating $\sigma_{as}^2$, $\sigma_{bs}^2$, 
or $\sigma_{cs}^2$. One would ordinarily not pool 
these three under the assumption of 
homogeneity to test A, B, and C unless 
all variance components except $\sigma_e^2$ were 
assumed to be zero. Thus, why pool the 
mean squares for linear, quadratic, etc., 
of the interaction to test the components 
of the main effect? 

3. One could take other cases, e.g. 
when two components are equal and 
greater than zero while the third is equal 
to zero, one is greater than zero and two 
are nonexistent, etc. However, in all 
these possibilities F tests using the un- 
partitioned procedure would be biased. 

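For example, assume that $\sigma_{ts(l)}^2 = \sigma_{ts(q)}^2 > 0$ 
and $\sigma_{ts(c)}^2 = 0$. The F tests of 
the treatments components using the unpartitioned 
interaction as error would 
be: 

$$\text{linear } E(F) = \frac{\sigma_e^2 + \sigma_{ts(l)}^2 + n\sigma_{t(l)}^2}{\sigma_e^2 + \sigma_{ts}^2}$$ 

$$\text{quadratic } E(F) = \frac{\sigma_e^2 + \sigma_{ts(q)}^2 + n\sigma_{t(q)}^2}{\sigma_e^2 + \sigma_{ts}^2}$$ 

$$\text{cubic } E(F) = \frac{\sigma_e^2 + \sigma_{ts(c)}^2 + n\sigma_{t(c)}^2}{\sigma_e^2 + \sigma_{ts}^2}$$ 

Since $\sigma_{ts}^2$ is an averaging of the interaction 
components, $\sigma_{ts}^2 < \sigma_{ts(l)}^2$, $\sigma_{ts}^2 < \sigma_{ts(q)}^2$, 
and $\sigma_{ts}^2 > \sigma_{ts(c)}^2$. Thus F tests 
of linear and quadratic portions would 
be positively biased whereas the test of 
the cubic portion would be negatively 
biased. 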
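
A small numerical sketch may make the direction of the bias concrete. It compares, under each null hypothesis (the treatment trend component set to zero), the ratio of expected mean squares obtained with the matching partitioned error term and with the pooled interaction. The variance values are hypothetical, the function name is illustrative, and the pooled interaction variance is taken to be the simple average of the component variances, as the text describes.

```python
from fractions import Fraction


def null_ratios(sigma_e2, sigma_ts2_by_component):
    """Ratio of expected mean squares under the null for each trend component.

    sigma_ts2_by_component maps a component name to its Treatments x Ss
    variance. A ratio of 1 means the test is unbiased; above 1 is a
    positive bias, below 1 a negative bias.
    """
    comps = sigma_ts2_by_component
    # The pooled interaction variance is the average of the components.
    pooled = sum(comps.values()) / len(comps)
    out = {}
    for name, v in comps.items():
        numerator = sigma_e2 + v  # E(MS) of the treatment component under the null
        out[name] = {
            "partitioned": numerator / (sigma_e2 + v),
            "pooled": numerator / (sigma_e2 + pooled),
        }
    return out


# Hypothetical values: linear and quadratic interaction present, cubic absent.
ratios = null_ratios(
    Fraction(1),
    {"linear": Fraction(2), "quadratic": Fraction(2), "cubic": Fraction(0)},
)
```

With these values the partitioned ratios are all exactly 1, while the pooled ratios are 9/7 for the linear and quadratic components and 3/7 for the cubic component, matching the positive and negative biases stated above.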

As the number of points on the curve 
increases, the less realistic is it to assume 
that all interaction components are estimates 
of $\sigma_e^2$ (or if one insists on pooling 
the components under the assumption of 
homogeneity, that the $\sigma^2$ due to the 
linear and higher components are present 
but equal). Thus the probability 
should be great that one or a few of the 
components will be estimating only $\sigma_e^2$ 
(or all $\sigma^2$ components of the trend are 
present, but unequal), leading to positively 
and negatively biased F tests. 


SOME CAUTIONS WITH TREND ANALYSIS 


We have stressed the orthogonal poly- 
nomial trend analysis in a wholly accept- 
ing manner so far. However, there are 
several important cautions which an in- 
vestigator should consider in a trend 
analysis. 

1. Even though we have proceeded to 
test each of the linear and higher com- 
ponents in each design, the investigator 
should have some a priori basis for mak- 
ing certain F tests rather than all. The 
components which contribute signifi- 
cantly (according to F tests) should be 
accepted only if there are good a priori 
reasons for expecting them or plausible 
ad hoc reasons for not discounting them 
(Lewis, 1960). However, the conclu- 
sions should be based primarily on sta- 
tistical tests. Whenever the investigator 
is doing preliminary work with no de- 
finitive a priori notions about the prob- 
lem, obviously he must rely completely 
on statistical tests. 

A related problem is that the ortho- 
gonal polynomial procedure may not 
provide the best fit. This procedure considers 
y as a function of $x$, $x^2$, $x^3$, etc. 
However, y may be a function of x° (or 
x to some exponent) and another func- 
tion would be more appropriate than 
would a polynomial function. 

2. A serious problem can occur with 
the repeated measurements design. One 
of the critical requirements for the use 
of univariate analysis of variance is that 
of zero (or constant) correlation between 
treatments. Unfortunately this 
requirement is not met in the repeated 
measurements design when each S is 
used for every treatment. This lack of 
constant correlation will lead to biased 
F tests and distort the probability levels 
of these tests (see Gaito, 1961). How- 
ever, Box (1954) and Geisser and 
Greenhouse (1958) have developed pro- 
cedures to overcome these distortions 
which involve reducing the degrees of 
freedom for the F tests. These proce- 
dures provide for conservative F tests; 
however, Geisser and Greenhouse main- 
tained that they may be too conservative. 
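
As a minimal sketch of what reducing the degrees of freedom means in practice, the function below returns the nominal degrees of freedom for the overall treatments F test in a Treatments x Ss design together with the lower-bound (most conservative) degrees of freedom commonly attributed to Geisser and Greenhouse, namely 1 and n - 1. The function name is hypothetical, and the sketch covers only this lower-bound case, not an estimated correction.

```python
def conservative_df(t, n):
    """Nominal and lower-bound degrees of freedom for the treatments F test.

    t is the number of treatments and n the number of Ss. The observed F is
    unchanged; only the reference distribution used to judge it differs.
    """
    nominal = (t - 1, (t - 1) * (n - 1))
    # Lower bound: both df are divided by (t - 1).
    conservative = (1, n - 1)
    return nominal, conservative
```

For the four-treatment, five-S design used in the examples, this gives nominal df of 3 and 12 and conservative df of 1 and 4.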

Thus one should be cautious in the 
use, and interpretation of the results, of 
trend analysis when repeated measure- 
ments designs are involved. The results 
should be considered as an approxima- 
tion but these procedures would still 
provide useful information relative to 
trends. 


CONCLUSIONS 


The use of the unpartitioned interac- 
tion as the error term for testing the 
linear, quadratic, and other components 
of the quantitative main effect is inap- 
propriate if one looks carefully at the 
E(MS) of the various sources included 
in the experiment. 

Furthermore, we maintain that as- 
sumptions concerning homogeneity of 
cell variance, homogeneity of each S's 
trend, and homogeneity of linear, quad- 
ratic, and higher components used to 
justify the unpartitioned error procedure 
are irrelevant or theoretically inappro- 
priate and unrealistic to F tests of con- 
cern. The discussion by Grant (1956) 
of trend analysis procedures using parti- 
tioned error terms appears to be more 
appropriate in meeting the requirements 
of the model than are other treatments 
of this subject. 

DIRECTIVENESS AND NONDIRECTIVENESS IN 
RESEARCH INTERVIEWING: 

A REFORMULATION OF THE PROBLEM 1 

AND STEPHEN A. RICHARDSON 

A framework is presented for the design of research interviews, based 
on the principle that throughout an interview, no matter whether direc- 
tive or nondirective, the interviewer prescribes various aspects of the 
style of the informant's response in each question he asks. This frame- 
work is used to analyze 3 kinds of research interviews in current use, and 
to clarify the purpose for which each is designed. It is suggested that 
controversies about the relative merits of these interview designs can 
be resolved by systematic study of the effects of interviewers’ prescrip- 
tions on the reliability and validity of informants’ responses. Such 
studies might also suggest other interview designs besides those now 
being used. 


The responses an informant gives to 
an interviewer can be strongly influ- 
enced by the way questions are asked. 
Although this fact has been a subject 
of concern and study since the turn of 
the century,2 research workers have not 
agreed on what to do about it. They 
agree that influence by the interviewer 
can introduce undesirable bias into in- 
formants’ responses and should there- 
fore be eliminated (e.g., Kinsey, Pome- 
roy, & Martin, 1948; Rogers, 1945), 
but they differ when it comes to design- 
ing an interview to achieve this purpose. 
Although some advocate strictly nondirective 
interviewing (e.g., Merton, 
Fiske, & Kendall, 1956), others employ 
highly directive techniques (e.g., Kinsey 
et al., 1948), and still others advocate a 
middle course (e.g., Maccoby & Maccoby, 
1954). Our purpose in this paper 
is to suggest a method for resolving this 
disagreement based on a different approach 
to the design of research interviews. 

1 The research reported in this paper was 
done as part of the Field Methods Training 
Program, supported by the Cornell University 
Social Science Research Center and the Carnegie 
Corporation, with an Advisory Committee 
including Urie Bronfenbrenner, John Dean, 
John Harding, Robert J. Smith, and William 
F. Whyte. A book based on the results of this 
research will be published by Basic Books, 
Incorporated. For their helpful criticism of this 
paper, we wish to thank Bruce Dohrenwend 
and Lawrence Nyman. 

2 For instance, from 1903 to 1906 William 
Stern edited a periodical devoted to this problem: 
Beiträge zur Psychologie der Aussage, 
published by J. A. Barth in Leipzig. 

The difference as to the degree of 
directiveness or nondirectiveness needed 
in research interviews reflects disagree- 
ment about what constitutes undesir- 
able influence on an informant. At one 
extreme are those who support Rogers’ 
(1945) recommendation that, in re- 
search interviews, the topic of the inter- 
view should be introduced by the inter- 
viewer, but thereafter the informant's 
responses should be allowed to guide the 
interview. This procedure aims to pre- 
clude the interviewer's influencing the 
informant in two ways. First, the inter- 
viewer avoids influencing the informant's 
response to any given question in one 
direction or another. Second, he allows 
the informant to determine the course 
of development of topics in the interview. 

A narrower conception of what con- 
stitutes undesirable influence on the 
informant was described by Maccoby 
and Maccoby (1954). Discussing 
probes to supplement prepared questions 
in an interview schedule, they indicated 
agreement with Rogers’ position that 
the interviewer should not influence the 
direction of responses to particular ques- 
tions. In contrast to Rogers, however, 
they stated that the interviewer is re- 
sponsible for guiding the informant to 
particular topics so that information will 
be obtained on predetermined dimen- 
sions of the research problem. Thus, 
these two views agree only in proscrib- 
ing techniques which might influence the 
response to a particular question in one 
direction rather than another. 

Even this proscription was not ac- 
cepted by Kinsey and his colleagues 
(1948). Despite their assertion that 
“questions are not so phrased as to bias 
the subject’s reply [p. 52],” they re- 
ported asking their informants when 
they first engaged in various sexual 
activities, rather than whether they had, 
in order to make it more difficult for the 
informants to deny their sexual experi- 
ence. Kinsey and his colleagues as- 
sumed that, when informants are ex- 
pected to be evasive, questions which 
direct them toward one response rather 
than another exert a desirable influ- 
ence. 

The widespread criticism of the direc- 
tive interviewing techniques used by 
Kinsey and his colleagues (Cochran, 
Mosteller, & Tukey, 1954) might sug- 
gest that they had violated a universally 
accepted rule of interview design. Their 
procedure was, however, not actually as 
unusual as this reaction would suggest. 
For example, Smith, Bruner, and White 
(1956), in their intensive study of the 
opinions and personalities of 10 men, 
tested the limits of their subjects' opinions 
in an interview in which they argued 
with the subject and attacked the in- 
consistencies in his position. This pro- 
cedure was, they noted, quite different 
from the nondirectiveness of many of 
their other procedures. 

For extensive study of public opinion, 
rather than intensive study of the structure 
of individuals' opinions, Litvak 
(1956) also proposed that directive 
questions be used to test the limits of 
informants' opinions. He suggested, for 
example, that the loaded question 
“Don’t you agree that union leaders are 
racketeers? [p. 182]" could be used to 
identify strong prounionists, since only 
these informants would respond in the 
negative. Informants with moderately 
prounion opinions would not be ex- 
pected to resist the antiunion bias of the 
question. 

When informants' commitment to a 
given opinion is already known, Beezer 
(1956) found that directive questions 
could be used to induce them to talk 
freely. For example, when East German 
refugees who the investigator knew were 
paying high prices for food were asked, 
"I understand you don’t have to pay 
very much for food in the East Zone 
because it is rationed [p. 13]," they 
generally went to some length to en- 
lighten the interviewer about conditions 
in East Germany. 

Another indication of the extent to 
which directive questions are actually 
used is provided by Richardson's (1960) 
finding, in interviews by seven experienced, 
professional interviewers, that 
22-40% of their questions were leading. 
Thus, if we start with the problem of 
designing interviews so that interviewers 
exert no undesirable influence on informants' 
responses, we find no point of 
agreement among practitioners as to 
how this should be done. 

We suggest that this impasse concerning 
the design of research interviews can 
be overcome by turning our attention 
from the problem of nullifying the interviewer's 
undesirable influence to determining 
how he does, in fact, influence 
the informant. We base this proposal 
on the assumption that, in any kind of 
interview, whether directive or nondirective, 
the interviewer necessarily dictates 
the style of the informant's responses 
by means of prescriptions implicit in 
[...] prescriptions, we shall show, are the elements of 
which different kinds of interviews are 
[...] the effectiveness of different kinds of interviews. 

Assumptions about the function of the 
person who is interviewed are implicit in 
the term used to describe him. In survey 
research the commonly used term "respondent" 
implies an expectation of 
brief responses to questions posed by 
the interviewer. In anthropological and 
other forms of nonscheduled interviews 
the usual term "informant" implies the 
provision of information on more general 
and complex topics such as organizational 
structure or life histories. We 
shall use informant for all persons interviewed, 
including respondents. 

PRESCRIPTIONS IMPLIED BY INTERVIEWERS' QUESTIONS 

Three aspects of the informant's response 
style prescribed by the interviewer's 
questions are length, method of 
topic selection, and freedom of choice of 
response (Dohrenwend & Richardson, 
1956). The interviewer also prescribes 
such response properties as subject 
matter, scope, and function (e.g., clarification, 
extension) of response, but these 
prescriptions are not central to the problems 
under discussion. 


Prescription of Length of Response 

The interviewer's questions prescribe 
the desired length of [...] In 
general, a closed question can always be 
answered in a few words, whereas an 
open question generally requires more 
than a few words for an adequate 
answer. More specifically, there are 
three types of closed questions, labeled 
the selection question, the yes-no question, 
and the identification question, to 
indicate the particular kind of response 
[...] 


Prescription of Responsibility for Topic 

[...] introduced by the informant in a recent 
response. 


Prescription of Direction of Informants’ 
Response 


The third major kind of prescription 
in each of the interviewer’s questions 
concerns the informant’s freedom of 
choice in responding. There are two 
distinct forms of this prescription. First, 
the interviewer may or may not indicate 
that he expects the informant to give 
one particular response. A question 
which does prescribe that the informant 
give a particular response is conventionally 
labeled leading or loaded. An 
example is: “You are going to vote 
Democrat, aren't you?" In contrast a 
nonleading form of the question is: “Are 
you going to vote Democrat or Repub- 
lican?" 

The second form of prescription is 
conveyed by an assumption made by the 
interviewer. The informant can answer 
the question in this form only by accept- 
ing the assumption. If he rejects the as- 
sumption he must reject the question. 
An example of a question which em- 
bodies an assumption is Kinsey’s 
(1948): “asking when they first en- 
gaged in such activity [p. 53]." Clearly, 
the alternative is to ask informants 
whether they have ever engaged in such 
activity. In general, however, the dis- 
tinction is not between questions which 
do include an assumption and those 
which do not, since many questions are 
based on some kind of assumption about 
the informant's knowledge or experi- 
ence. Rather, the distinction is between 
questions which, on the one hand, make 
an assumption not justified by any in- 
formation the interviewer has about the 
informant and, on the other hand, ques- 
tions which either make no assumption 

or make an assumption justified by in- 
formation already available about the 
informant. In this sense, Kinsey's question 
includes an assumption. If, how- 
ever, it had been asked only of inform- 
ants who said they had engaged in a 
particular activity, it would not have 
included an assumption. 

Since the terms leading or loaded have 
been applied to Kinsey’s question and 
to those like the voting example, we shall 
use other terms to designate them. The 
latter form of leading question we label 
a suggestion and the alternative a non- 
suggestion. Kinsey’s form of leading 
question is an assumption and the alter- 
native a nonassumption. A given ques- 
tion can be both a suggestion and an as- 
sumption, one or the other, or neither. 


ANALYSIS OF THREE INTERVIEW 
DESIGNS 


Differences and similarities among the 
three major interview designs currently 
employed in behavioral research are 
obscured by such oversimplified labels as 
directive and nondirective. In order to 
see how they differ, and, ultimately, to 
understand why they differ, we will 
analyze the pattern of interviewers’ pre- 
scriptions characteristic of each inter- 
view design. 

The first design to be analyzed is ex- 
emplified by the opinion survey and will 
be labeled, for reasons to be explained, 
the limited-response interview. The 
second goes under a variety of names 
such as semistructured (Argyris, 1960), 
informal (Bingham, Moore, & Gustad, 
1959), and clinical or nondirective 
(Nahoum, 1958). We shall call this the 
free-response interview. The third is the 
directive type used by Kinsey and his 
colleagues (1948) and under the title, 
Stress Interview, by Smith, Bruner, and 
White (1956). This we shall call the 
defensive-response interview. 


Limited-Response Interview 


The labels we have given to the three 
interview designs indicate the general 
task assigned to the informant by means 
of prescriptions in interviewers' questions. 
In the limited-response interview 
the informant is expected to fill in a 
more or less detailed outline, drawn by 
the interviewer, without challenging the 
shape of that outline. In an opinion 
survey using this type of interview, for 
example, the informant is asked to give 
his opinions on issues defined by the 
interviewer, and generally in terms 
specified by the interviewer, but not to 
determine which of his opinions are 
relevant to the interviewer's inquiry. 
The interviewer defines this task, first, 
by prescribing length of response. 
Although the composition of limited- 
response interviews in terms of open and 
closed questions varies, we include in 
this category only interviews in which 
closed questions predominate over open 
questions. Our examination of published 
interviews shows a range from about 
two closed questions for each open ques- 
tion (e.g., Gurin, Veroff, & Feld, 1960) 
to as high as 15 closed for each open 
question (e.g., Berelson, Lazarsfeld, & 
McPhee, 1954) in the main body of the 
interview, excluding closed background 
questions. In addition, so-called poll in- 
terviews may consist entirely of closed 
questions. Thus, although limited-re- 
sponse interviews are not homogeneous 
in their use of closed and open questions, 
they can be characterized as prescribing 
that, for the most part, the informant 
should limit himself to brief responses. 
The second prescription of the in- 
formant’s task in limited-response in- 
terviews concerns responsibility for topic 
selection. Generally in this type of in- 
terview, the interviewer has a prepared 
list of questions which he asks in a pre- 
determined order. Thus most, though 
not necessarily all, of the interviewer's 
questions appear to the informant to be 
concerned either with a topic which the 
interviewer has introduced in a recent 
question or with a new topic. That is, 
when the interviewer is required to ask 
questions in the exact words and se- 
quence given in a prepared schedule, a 
question can appear to assign responsi- 
bility for topic selection to the in- 
formant only when it is so phrased that 
the informant may be uncertain about 
whether it was one of the interviewer's 
prepared questions. For example, in the 
interview used by Gurin, Veroff, and 
Feld (1960) to study mental health, we 
find that about 6% of the questions 
could prescribe that the informant take 
responsibility for topic selection. 

When the interviewer is allowed some 
freedom to change the wording and se- 
quence of questions, the proportion of 
questions which appear to assign topic 
selection to the informant may be as 
high as one third (e.g., Dembo, Leviton, 
& Wright, 1956). Nevertheless, if the 
interview is of the limited-response type, 
questions assigning responsibility for 
topic selection to the informant are 
clearly in the minority. 

Our analysis of the questioning pro- 
cedure in the limited-response interview 
shows that the great majority of ques- 
tions implicitly instruct the informant 
both to limit the length of his response 
to a few words and to limit the content 
of his response to the topic designated 
by the question. Previous work has 
shown that both these prescriptions 
are generally observed by informants 
(Dohrenwend & Richardson, in press). 
Thus, the limited-response interview can 
be characterized as one in which, in line 
with the interviewer's prescriptions, in- 
formants tend to restrict themselves to 
short responses on topics designated by 
the interviewer. 

Although textbook writers have gen- 
erally indicated that limited-response in- 
terviews should be free of assumptions 
and suggestions (e.g., Kornhauser & 
Sheatsley, 1959), Litvak (1956) has 
convincingly argued that assumptive and 
suggestive questions are, in practice, 
employed to good use under labels such 
as preference measure, or extreme item. 
Therefore, although they are not the 
dominant form of question in limited- 
response interviews, assumptions and 
suggestions are both used when they suit 
the investigator's particular measure- 
ment needs. 


Free-Response Interview 


In the free-response interview, the 
informant’s task is to build a picture 
around one or more points of orienta- 
tion provided by the interviewer. In 
contrast to the limited-response inter- 
view, the free-response interview does 
not explicitly define boundaries for the 
informant. He is expected, however, to 
maintain contact with the central focus 
or foci of the interview. 

The free-response interview differs 
from the limited-response interview 
both with respect to prescription of 
length of response and prescription of 
responsibility for topic selection. For 
prescription of length of response the 
interviewer uses more open than closed 
questions. Published descriptions of in- 
terviews of this type indicate, in one 
instance, a rate of about four open to 
three closed questions (Adorno, Frenkel- 
Brunswik, Levinson, & Sanford, 1950), 
and in another instance a ratio of three 
to two (Argyris, 1960) among ques- 
tions prepared before the interview. 
Supplementary questions are guided by 
instructions to minimize closed ques- 
tions (e.g., Argyris, 1960). 

While closed questions are in the 
minority they are still used quite fre- 
quently in the free-response interview. 
This conclusion is based on our having 
observed, in interviews in which open 
questions were supposed to be used ex- 
clusively, that experienced interviewers 
were unable to avoid introducing closed 
questions. The problem seems to be that 
open questions provide insufficient feedback 
to the informant about the interviewer's 
comprehension of his responses 
and thus tend to make the informant 
insecure. For this reason, some free-response 
interviews may contain no 
more than a bare majority of open over 
closed questions. 

Descriptions of free-response inter- 
views stress that the interviewer helps 
the informant to describe his experi- 
ences and feelings in the context that 
is meaningful to him (e.g., Merton et 
al., 1956). In terms of prescriptions, 
the rule in these interviews is that as 
many questions as possible prescribe 
that the informant take responsibility 
for topic selection. Unfortunately, this 
type of question creates a problem be- 
cause of its secondary effect on informants' 
responses (Dohrenwend & Richardson, 
in press). When the interviewer 
prescribes that the informant take re- 
sponsibility for topic selection, the in- 
formant tends not only to volunteer 
information relevant to the purpose of 
the interview, as desired, but also to 
introduce irrelevant digressions. In the 
free-response interview, therefore, the 
interviewer faces the problem of mini- 
mizing irrelevancies without discourag- 
ing the informant from taking a large 
measure of responsibility for topic selec- 
tion. 

We have evidence that certain types 
of suggestions seem, surprisingly, to 
help solve this problem (Dohrenwend & 
Richardson, in press). Although the use 
of suggestions is often explicitly discouraged 
in free-response interviewing 
(e.g., Whyte, 1953), there is reason to 
believe that they are, in practice, used 
by some experienced interviewers 
(Richardson, 1960). Furthermore, certain 
kinds of suggestions seem to have 
the unexpected property of inducing informants 
to volunteer information not 
directly requested without, at the same 
time, encouraging an unusual number of 
digressions from the topic of the interview. 
The particular suggestions which 
seem to have these effects are either 
those which are correct—that is, direct 
the informant to the response he would 
have given anyway—or those which are 
grossly incorrect, but not those which 
are only slightly incorrect. Correct and 
grossly incorrect suggestions appear, 
therefore, to function in the free- 
response interview as supplements to the 
direct prescription that the informant 
take responsibility for topic selection. 

Thus, the questioning procedure in 
the free-response interview prescribes 
that the informant should feel free to 
give lengthy responses on topics which 
he develops from those introduced by 
the interviewer. This interview differs 
from the limited-response interview in 
being designed to give the informant a 
large measure of freedom and responsi- 
bility for the formulation of his re- 
sponses. 


Defensive-Response Interview 


In the defensive-response interview, 
the informant is faced with an inter- 
viewer who tries to force him into a 
particular position in each of a number 
of areas. The informant is expected to 
defend himself against being pushed 
into any position which, in fact, does 
not fit. For example, if asked when he 
first engaged in a particular activity, he 
is expected to say that he never did if 
this is actually the case. 

There are few published examples of 
the defensive-response interview in re-
search. Kinsey, Pomeroy, and Martin's
(1948) description of their interview 
suggests, however, that a major weapon 
in the interviewer's attack on the in- 
formant is the unjustified assumption. 
In addition, incorrect suggestions are 
probably effective in developing the 
argumentative quality attributed by 
Smith, Bruner, and White (1956) to 
their Stress Interview. Thus, the defen-
sive-response interview is characterized
by restriction of the informant’s free-
dom of choice of response.

Another characteristic of this inter- 
view is suggested by the report, con- 
cerning the Kinsey interview, that ques-
tions were asked “in an order which
seems . . . hard to predict, so that it
is difficult to tell what is coming next
[Cochran et al., 1954, p. 20].” That is,
the interviewer prescribes that he, 
rather than the informant, should take 
responsibility for topic selection. The 
informant is expected simply to deal 
with each topic as the interviewer intro- 
duces it. 


PURPOSES OF THE THREE 
INTERVIEW DESIGNS 


Research workers differ about the 
merits and limitations of the three in- 
terviews we have described, their at- 
tempts to resolve these differences gen- 
erally taking the imperialistic form of 
arguing that one design is adequate to 
all purposes (e.g., Lazarsfeld, 1944). 
For reasons which we present in the re- 
mainder of the paper, this seems to us 
an unlikely settlement of the contro- 
versy about interview design. 

Comparison of the purposes of the 
three interviews has been confused to 
some extent by two spurious arguments 
for rejecting one or another design. The 
first argument, which deserves more at- 
tention than we can give it in this paper, 
is that only the limited-response inter-
view provides standardized data (e.g.,
Hyman, 1954; Maccoby & Maccoby, 
1954). It is argued that unless identi- 
cal question order and wording are used 
with all informants, the data cannot be 
considered standardized. However, the 
alternative possibility, that psychologi- 
cal equivalence of meaning is sometimes 
more nearly approached by adjusting 
the wording and order of questions to 
the informant, cannot be rejected 
(Cochran et al., 1954; Merton et al., 
1956). Therefore, we shall assume that
any of the three interview designs can
be considered for collecting standardized 
data. 

Second, there is a misconception about
the defensive-response interview, appar-
ently based in part on the suggestion by
Kinsey and his colleagues (1948) that 
their interviewing resembled that of 
“detectives and other law-enforcement
officials [p. 54].” From this characteri-
zation some critics have concluded that,
in this form of interview, the interviewer 
cannot achieve the informal, permissive 
relationship generally considered es- 
sential to good research interviewing 
(e.g., Wallin, 1949). The fact that the 
interviewer conducts a cross-examina- 
tion, however, need not imply that he 
treats the informant as an accused male- 
factor. He may suggest that the inform- 
ant has done something, or holds a par- 
ticular belief, without indicating at the 
same time that the act or belief is repre- 
hensible. Furthermore, informants of 
Kinsey and his colleagues have clearly 
indicated that their interviews appeared 
to them to be “a friendly conversation
[Cochran et al., 1954, p. 20],” or the
interviewer “a scholarly gentleman of
humor and charm [Skinner, 1950,
p. 31].” The defensive-response inter-
view need not, therefore, be different in
feeling tone from any other form of re-
search interview.

The real differences in the purposes of 
the three interview designs are revealed 
by the ways in which the interviewer’s 
prescriptions concerning response style 
set different tasks for the informant in 
each interview. Examination of these 
tasks will indicate the basis for research 
workers’ disagreements about the merits 
and demerits of each design. 


Limited-Response Interview


In the limited-response interview the 
informant's task is to fill in an outline 
provided by the interviewer's questions. 
This task is accomplished, for the most
part, by brief responses, such as “yes,”
or “fairly strongly,” or “three years,”
which, separated from their questions,
obviously have no meaning. If these re-
sponses are to be valid, the questions
must mean to the informant exactly
what they mean to the investigator. The
limited-response interview appears,
therefore, to be appropriate when in-
vestigators and informants share a com-
mon vocabulary relevant to the issues
and alternatives to be included in the
interview.

The most obvious use for the limited- 
response interview, and one for which it 
is recommended by experienced research 
workers (e.g., Albig, 1957), is the study 
of opinions on public issues such as elec- 
tions and major political policy ques- 
tions (e.g., Berelson et al., 1954; Lazars- 
feld & Thielens, 1958). It is also used, 
however, for studies concerned with less 
clearly articulated issues and alterna- 
tives, and here some investigators ques- 
tion the adequacy of the limited-re- 
sponse interview. For example, a re- 
viewer of Americans View Their Mental 
Health, while accepting the basic value 
of the study, noted the problem of inter- 
pretation raised by “variations in mean- 
ings of terms like happiness, worry, 
personal problems, etc. [Nunnally, 1961, 
p. 264].” That is, these crucial terms in 
the interview may not have had the same 
meaning for the investigators and all 
their informants.


Free-Response Interview 


The free-response interview is in- 
tended to help the informant to examine 
and report the meaning to him of his 
experiences and feelings (e.g., Merton 
et al., 1956). Thus, this type of inter- 
view is designed to overcome the weak- 
ness which some research workers attrib- 
ute to the limited-response interview. 
By enabling the informant to describe 
his meanings to the interviewer, it avoids 
the assumption that the interviewer can
understand them without explanation. It
appears, therefore, to be designed for the 
very topics for which the limited-re- 
sponse interview is most likely to be con- 
sidered inappropriate—that is, ambigu- 
ous public issues or private matters. 


Defensive-Response Interview


The defensive-response interview is 
designed to break down the informant's 
resistance to reporting certain experi- 
ences or expressing certain opinions. 
That is, for topics entangled in strong 
social norms as to what is taboo and 
what is socially desirable, this kind of 
interview has been offered as a solution, 
notably by Kinsey, Pomeroy, and Mar- 
tin (1948). 

Many investigators argue, however, 
that Kinsey and his colleagues are wrong 
in assuming that the defensive-response 
interview is needed to overcome resist- 
ance. They suggest that the incorpora- 
tion of indirect questions, particularly 
projective questions, into the limited- 
response or the free-response interview 
is sufficient to deal with problems of 
resistance (e.g., Argyris, 1960; Mac- 
coby & Maccoby, 1954). 

Since both of these positions involve 
problematical assumptions, we are not 
prepared to choose between them. The 
indirect questions present the obvious 
problem of interpreting, and possibly 
misinterpreting, the meaning of the re- 
sponses. On the other hand, the in- 
vestigator using the defensive-response 
interview must assume an absence of 
suggestibility in his informants, an as- 
sumption made questionable by recent 
evidence that this characteristic in some 
individuals cannot be eliminated by 
manipulating situational factors (Hov- 
land & Janis, 1959; Stukat, 1958). 

All three of the interview designs we 
have discussed are controversial in one 
way or another. Implicit in these con- 
troversies, however, is agreement that 
the objective of any interview is to pro-
duce reliable and valid responses. The
disagreements concern the question of 
effective means for achieving this end. 
It seems to us that this debate is not 
likely to be resolved by comparing com- 
plex interview designs directly but must 
be translated into more specific terms to 
make it amenable to study. The terms 
we suggest are interviewers' prescrip- 
tions concerning informants' response 
style. An understanding of how inter- 
viewers’ prescriptions affect the reliabil- 
ity and validity of informants’ responses 
could provide a sound basis for evaluat- 
ing the various interview designs built 
from these component prescriptions. 


EFFECTS OF INTERVIEWERS’ PRESCRIP- 
TIONS ON RELIABILITY AND VALIDITY 
OF RESPONSES 


Surprisingly little systematic work has 
been done on the problem of how inter- 
viewers’ prescriptions affect response 
reliability and validity. There is some 
research on the effects of prescriptions 
of length of response, but most of it has 
been limited by its exclusive con- 
cern with the limited-response interview. 
Within this framework, some investi- 
gators have studied the effects of open 
and closed questions on survey inter- 
viewers’ interpretation and recording of 
informants' responses (e.g., Hyman,
1954), rather than their direct effects on 
responses. When the effect of prescrip- 
tion of length on responses as such has 
been studied, the comparison has in- 
volved a narrow range of prescriptions, 
that is, yes-no as against selection or 
identification types of closed questions 
(e.g., Cantril, 1947), rather than closed 
as against open questions. The hypoth- 
esis that, in some circumstances, open 
questions yield more valid responses 
than closed questions, which is implicit 
in the design of the free-response inter- 
view, has not to our knowledge been 
tested under controlled conditions. 

Even more neglected has been the
systematic study of the effect of pre-
scription of responsibility for topic selec- 
tion on the reliability and validity of in- 
formants’ responses. Rogers (1945) 
reports that validity is greater when the 
informant is assigned this responsibility 
than when the interviewer takes it him- 
self, but does not report substantiating 
evidence. Stern (1938) found that tes- 
timony given as free narrative contained 
fewer errors than testimony given in re- 
sponse to detailed questioning. This 
comparison indicates that validity of 
responses is affected in the case of the 
extreme contrast between complete re- 
sponsibility for topic selection being in 
the informant's hands (free narrative), 
as against complete responsibility being 
in the interviewer's hands (detailed 
questioning). This evidence could pro- 
vide a starting point for investigations 
into questions such as: How do inter- 
mediate positions, in which, to varying 
degrees, interviewer and informant share 
responsibility for topic selection, com-
pare with the extreme cases of complete
responsibility being on one side or the 
other? How does the effect of assign- 
ment of responsibility for topic selection 
vary with the subject matter of the 
inquiry? How does it vary with the type 
of informant? 

The most thoroughly debated, if, 
again, not so thoroughly studied, inter- 
viewing techniques are those which pre- 
scribe restrictions on the informant's 
freedom of choice of response—namely, 
suggestions and assumptions. We can 
start with the presumption that both 
suggestions and assumptions can lead 
informants to produce invalid responses, 
though we have evidence that this effect 
does not occur with all informants under 
all conditions (Richardson, 1960). The 
problem is whether, as some critics have 
suggested, more valid responses are pro-
duced in some circumstances by sugges-
tions or assumptions than by nonsugges-
tions and nonassumptions. Immediate
attention might be given to the hypoth-
esis suggested by Kinsey and his col-
leagues (1948), on the basis of their ex-
perience, that assumptions provide a
technique for overcoming the influence
of social desirability on responses.

Ultimately we must return to the fact
that in actual interviews various kinds
of prescriptions always operate simul-
taneously. The interview designs in cur-
rent use involve a restricted number of
combinations of prescriptions, and we
know almost nothing about either the
feasibility or the usefulness of combina-
tions outside this range. As we learn
more about how interviewers’ prescrip-
tions affect informants’ responses, we
may find combinations which provide
new interview designs to deal with some
of the increasingly complex problems of
data collection in behavioral science.


RUSSIAN THEORY AND RESEARCH ON
SCHIZOPHRENIA
University of Exeter


Most Russian work follows Pavlov's theory that schizophrenia is due 
to excessive protective inhibition in the cerebral cortex and is devoted 
to demonstrating in detail the truth of this hypothesis. According to a
large number of reports, schizophrenics differ from normals in orienta-
tion reactions, sympathetic reactivity, EEG, conditionability, and word
association tests. There appear to be 2 groups of schizophrenics: a
majority group in whom sympathetic tone and reactivity are low, and 
a minority group in whom these are high. 


In the past few years a number of in- 
vestigators in the West have advanced 
theories of schizophrenia in terms of dis-
orders of arousal (e.g., Lynn, 1962;
Venables, 1960) or some related concept 
such as anxiety (Mednick, 1958), sym- 
pathetic tone and reactivity (Gellhorn, 
1958), or some biochemical disturbance 
affecting these (e.g., Hoffer & Osmond, 
1960). Russian investigators have made 
a considerable number of studies along 
somewhat similar lines, although pre- 
sented in Pavlovian terminology, and 
the object of the present paper is to re- 
view the work of the last decade for 
Western readers. 


PAVLOV'S THEORY OF SCHIZOPHRENIA


Russian research on schizophrenia is 
still almost entirely dominated by Pav-
lovian theory, and a brief account of this
is necessary for an understanding of the 
rationale of the researches to be de- 
scribed. Pavlov (1941) held that in- 
tense or prolonged stimulation of the 
nerve cells induces a state of protective 
or transmarginal inhibition, the purpose 
of which is to protect the cells from fur- 
ther stimulation which would be harm- 
ful to them. When nerve cells are in 
a state of protective inhibition, they do 
not conduct excitation. It is important 
in following Pavlovian theory to distin- 
guish between protective inhibition and 
internal (active) inhibition, which is re-
sponsible for the extinction of condi-
tioned reactions with absence of rein- 
forcement and for discrimination (“dif- 
ferentiation”). Protective inhibition has 
the effect of weakening both the excita- 
tory and (internal) inhibitory processes. 
Hence, there are certain conditions in 
which internal inhibition cannot be gen- 
erated readily because the process has 
been weakened by protective inhibition. 
Pavlov put forward the theory that 
schizophrenia results from the genera- 
tion of protective inhibitions in the cere- 
bral cortex. The protective inhibition 
can be generated as a result of various 
kinds of shock, drugs, or physical ill- 
nesses. There is also an important con- 
stitutional factor in liability to schizo- 
phrenia, namely, the strength of the 
nervous system. This is the extent to 
which the nervous system is sensitive to 
stimulation, weak nervous systems being 
those that are most sensitive. A weak 
nervous system that is sensitive to stimu- 
lation is more likely to become over- 
stimulated, generate protective inhibi- 
tion, and hence, succumb to schizo- 
phrenia. The presence of protective 
inhibition in the cortex accounts for the 
slowness and poor conditionability of
schizophrenics. The other variegated
symptoms of schizophrenia are deter-
mined by the effect of protective inhibi-
tion in the cortex or the other parts of
the brain. In cases of catatonic stupor
the protective inhibition spreads down
to the subcortex and affects also the 
sympathetic nervous system. But in 
those schizophrenic states in which vio- 
lent outbursts occur, Pavlov suggested 
that the subcortex is overexcited, due to 
the removal of cortical control and the 
operation of the law of positive induc- 
tion, according to which inhibition in 
one region of the brain induces excita- 
tion in other areas. Pavlov compared 
this process with the excited outbursts 
of tired children and intoxicated adults, 
arguing that in both cases the cerebral 
cortex is weakened, its control on the 
subcortex diminished, and hence, the 
subcortex is freed and activates the emo- 
tional outbursts. 


ORIENTATION REACTIONS 


A considerable volume of work has 
been done recently in Russia on the so- 
called orientation reaction. The orienta- 
tion reaction covers what is sometimes 
called in the West the arousal reaction; 
i.e., the reaction by which the organism 
pays attention to new stimuli and is 
mobilized to deal with them. It has 
three chief components. First, there are 
changes in skeletal muscles: the animal
pricks up its ears, turns its body or head 
towards the new stimulus, muscle tonus 
rises, and there is an increase in mus- 
cular electrical activity. Secondly, the 
sympathetic division of the autonomic 
nervous system is activated: there is an 
increase in palmar skin conductance and 
pupil dilation, vasoconstriction in the 
limbs and vasodilation in the head, and 
variable changes in heart and respira- 
tion rates. Thirdly, the electroencephalo- 
gram (EEG) shows an increase in fre- 
quency and the alpha rhythm is blocked. 

The orientation reaction is distin- 
guished from the defensive reaction 
which occurs if the stimuli are intense, 
moderately intense and prolonged, or if 
the subject is in a state of tension. In 


the defensive reaction the subject shows 
signs of being frightened by the stimulus 
rather than interested in it, and there is 
vasoconstriction in the head as well as 
in the limbs. 

A number of the components of the 
orientation reaction in schizophrenics 
were recorded by Traugott, Balonov, 
Kauffman, and Luchko (1958), namely, 
movements of eyes and head, galvanic 
skin response (GSR), respiration rates, 
heart rates, and vascular reactions from 
the shoulder. The stimuli used were 
auditory (a tone, bell, and whistling 
sound), visual (lights), and tactile. In 
chronic deteriorated schizophrenics there 
were often no orientation reactions of 
any kind; where reactions were present, 
however, the autonomic reactions were 
much weaker than the motor compon- 
ents. In hallucinated-paranoid patients 
the size of the orientation reaction and 
its extinction with repeated presenta- 
tion of the stimulus were very variable, 
sometimes being stronger and sometimes 
weaker than in normal subjects. Some 
patients showed a defensive reaction, 
while others showed a poor orientation 
reaction. Reactions to stimulations were 
also observed during the course of in- 
sulin treatment. It was found that in 
the initial stages of treatment the pa- 
tients became overactive and gave de- 
fensive reactions to the stimuli. Later, 
with recovery, the stimuli elicited 
normal orientation reactions. 

A similar experiment is reported by 
Gamburg (1958) on 69 schizophrenics, 
mainly simple and paranoid. Motor and 
autonomic components of the orienta- 
tion reaction were recorded to auditory 
stimuli and electric shock to the fingers. 
It was found that very few of the schizo- 
phrenics gave normal orientation reac- 
tions. Patients diagnosed as simple 
schizophrenics tended to give no reac- 
tion at all, while paranoiacs tended to 
give defensive reactions. In four out of
five catatonic patients the initial stimu-
lus elicited a defensive reaction, but sub- 
sequent stimuli elicited no reaction at 
all. It was also found that when the 
schizophrenics did give an autonomic 
reaction, the autonomic disturbance con- 
tinued much longer than is usual in 
normal subjects. In those patients who 
had not given any orientation reactions 
to stimulation, caffeine restored the re- 
activity, but bromine and luminal had 
no effect. The author interprets this 
finding as supporting Pavlov's theory 
that the lack of reactivity is a result of 
excess inhibition, since it is assumed 
that caffeine dissipates the inhibition but 
bromine and luminal further increase it. 


SYMPATHETIC NERVOUS SYSTEM


A number of Russian investigations 
indicate that there is a depression of the 
sympathetic nervous system in schizo- 
phrenia, both in its level and its reactiv- 
ity to stimulation (Ekolova-Bagalei, 
1955; Stanishevskaya, 1955; Streltsova, 
1955; Vertogradova, 1955). Ekolova- 
Bagalei (1955) investigated 85 cata- 
tonic patients and reported low sym- 
pathetic tone as assessed by pulse rate, 
respiration rates, blood pressure, pupil 
diameter, sweating and vasomotor tone;
there was also little reactivity to stim- 
ulation. Similar results using the ple- 
thysmograph to measure vascular re- 
flexes to hot and cold stimuli were 
reported by Vertogradova (1955) on 
30 early cases of paranoid and simple 
schizophrenia.

In an investigation by Streltsova 
(1955) four studies were made of effects 
of stimulation on the pupil reaction in 
schizophrenic patients, the first being 
concerned with determining how far 
stimulation produces the normal reac-
tion of pupil dilation in schizophrenics.

The patients studied were 136 schizo- 
phrenics of different kinds (85 men, 50 
women, aged 14-55 years, length of ill-
ness 2 months to 25 years). The stimuli
used to elicit pupil reactions were hot 
and cold pricks (at 45° and 15°C.), a 
bell, and an olfactory stimulus. In 
normal subjects it was found that these 
stimuli elicit pupil dilation of the order 
of an increase of one eighth over the ini- 
tial pupil diameter. Of the schizo- 
phrenics, 27.4% reacted normally. The 
majority of the patients, 65%, showed 
strikingly subnormal reactions to stimu- 
lation; 40% showed no reactions at all 
and in 25% it was greatly reduced. 
These patients were long standing hebe- 
phrenics and hallucinated paranoiacs, 
and simple schizophrenias independent 
of the duration of the illness. The re- 
maining 7.6% of patients showed other 
abnormal pupil reactions. An exces- 
sively large pupil dilation was shown by 
3.9% to the extent of an increase of
40%-50% over the original pupil diam-
eter. Streltsova states that increases of
this size never occur in normal subjects.
These patients were tense and anxious
and had confused thought processes. The 
final group of 3.7% of the patients 
showed pupil constriction. This reaction 
is said never to occur in normal subjects 
except when they are in pain or ill. 
Streltsova argues that the low reactivity 
of the majority of her schizophrenics is 
a result of the high level of inhibition 
(she does not explain the high reactivity 
of the small minority of schizophrenics). 

In a second experiment, Streltsova 
(1955) goes on to investigate the extinc- 
tion of the pupil reaction in schizo-
phrenics using as subjects 50 patients in
whom it had been possible to elicit a re-
action. In this experiment a bell was
used as a stimulus and was presented in
two conditions: continuously, and a
number of times in short bursts. The
results showed that the schizophrenics
fell into two groups. Thirty-four out of
the 50 patients failed to extinguish the
pupil reaction. In normal subjects the
pupil reaction is extinguished after an
average of 15 seconds when the stimulus 
is presented continuously, and after 4-25 
presentations when it is presented suc- 
cessively for short intervals, but in the 
schizophrenic patients the reaction had 
not extinguished after 3 minutes in the 
continuous condition or after 50 pres- 
entations in the successive condition. A 
paradoxical result is now reported. 
After this failure of extinction the in- 
tensity of the stimulus was raised con- 
siderably. In normal subjects this pro- 
cedure increases the size of the orienta- 
tion reaction, but in the schizophrenics 
the reaction promptly extinguished and 
remained extinguished for 40 or more 
minutes. 

The second group of 16 patients ex- 
tinguished the orientation reaction as 
quickly or nearly as quickly as normal 
subjects, but these subjects took much 
longer than normal to recover from the 
extinction procedure and it was not pos- 
sible to restore the reaction through the 
presentation of an intense (disinhibit- 
ing) stimulus, as it is in normals. In 
considering her results Streltsova favors 
an explanation in terms of the tendency 
of schizophrenics to generate “protective 
inhibition.” She argues that this ac- 
counts for (a) the sudden appearance of 
inhibition, previously absent, when an 
intense stimulus is presented; and (b)
the length of time schizophrenics take 
to recover from the effects of extinction 
once their orientation reaction has been 
extinguished. 

In a third experiment Streltsova 
(1955) investigated the effects of caf- 
feine on the orientation reaction in 
schizophrenics. The rationale of this 
investigation springs from the hy- 
pothesis that the absence of reaction 
characteristic of most schizophrenics is 
a result of the strong inhibitory state of 
the nervous system. It is assumed that 
caffeine dissipates inhibition and hence,
the hypothesis is advanced that if
schizophrenics are given caffeine their 
orientation reactions should be restored. 
Fifteen patients who had shown the 
most persistent failure to give pupil re- 
actions were taken as subjects. They 
were given doses of .1, .3, .8, and 
1.3 milliliters of caffeine and were tested 
before and at intervals of 15, 30, 45, and 
60 minutes after injection. 

Thirteen of the subjects showed a 
normal orientation reaction or pupil con- 
traction after .1 and .3 milliliter of 
caffeine. Streltsova argues that this sup- 
ports her hypothesis that the schizo- 
phrenics were previously in an inhibited 
state. However, only 3 patients gave 
orientation reactions after doses of .8 
and 1.3 milliliter of caffeine, the other 
12 failing to respond. This failure to re- 
spond with the higher doses is at- 
tributed to protective inhibition. 

In her fourth study Streltsova (1955) 
compared the reactivity of patients early 
in the morning with that obtained later 
in the day. She based this investigation 
on the Pavlovian view that protective 
inhibition accumulates during the day, 
as a result of the stimulation received, 
and is dissipated during sleep. If this is 
so, an implication of the theory that 
schizophrenics are characterized by high 
levels of protective inhibition would 
seem to be that, if patients were tested 
immediately on waking in the morning, 
they would not have time to generate 
protective inhibition to any appreciable 
extent and should react more like nor- 
mals. In this connection Streltsova cites 
an observation by Naumova to the ef- 
fect that catatonic symptoms are less 
marked early in the morning and in- 
crease in the course of the day. Strelt- 
sova tested this theory using the pupil 
dilation measure of reactivity. Twenty- 
two schizophrenics were tested immedi- 
ately on waking in the morning and 
again 2-5 hours later. Twenty-one of
the patients responded normally im-
mediately on waking, or occasionally 
with overreactivity, but later in the day 
they failed to respond. 

An investigation of the characteristics 
of the vascular system in schizophrenics 
has been reported by Stanishevskaya 
(1961). Young simple schizophrenics 
and anxious hallucinated paranoiacs 
gave an unusually great number of spon- 
taneous reactions, which is interpreted 
as indicating a high level of excitation. 
This in turn is due to the weakening of 
the cortical inhibitory control over the 
subcortical areas. In catatonic schizo- 
phrenics and in hallucinated paranoiacs 
who were not anxious, there was very 
little or no spontaneous activity. When 
the patients were stimulated, catatonics 
gave no reactions, hallucinated para- 
noiacs gave normal reactions, and simple 
schizophrenics gave generalized vascular 
reactions but not the local vascular 
pressor reaction which in normal sub- 
jects succeeds the generalized reaction 
when stimuli are presented a number of 
times. 

Several investigators have reported 
that sympathetic tone and reactivity are 
improved in schizophrenics by stimu- 
lants including caffeine (Gamburg, 
1958; Trekina, 1955), cocaine (Eko- 
lova-Bagalei, 1955), and atropine
(Taranskaya, 1955).


EEG 


The EEG activity characteristic of 
schizophrenia reported by a number of 
Russian workers includes low frequency 
or absent alpha rhythm, a reduction or 
absence of blocking of the alpha rhythm 
to light and other stimuli, large latencies 
in alpha blocking when it can be ob- 
tained, and the presence of “constella-
tions” and “overflows” (Belenkaya,
1960, 1961; Frenkel, 1958; Gavrilova, 
1960; Segal, 1955; Trekina, 1955). In 
Gavrilova’s experiment 10 normal sub-
jects were compared with 14 schizo-
phrenics (5 cases of simple schizo- 
phrenia, 2 catatonic, and 7 paranoids; 
duration of the schizophrenia was 8- 
12 years). The EEGs were recorded 
when the patients were relaxed and after 
the presentation of visual and auditory 
stimuli. When the patients were re- 
laxed, a low frequency alpha rhythm 
was present in the paranoiacs; but in the 
simple and catatonics the frequency was 
below that of the alpha rhythm. No 
reaction to auditory and visual stimuli 
was obtained in the simple and cata- 
tonic patients, but some reaction oc- 
curred in the paranoiacs. With auditory 
stimuli, a low intensity stimulus pro- 
duced increased EEG frequency; but 
when the intensity of the stimulus was 
increased the reaction disappeared. The 
results are interpreted as indicating the 
inhibited state of the cortex in schizo- 
phrenics, especially in the simple and 
catatonic forms and to a lesser extent in 
paranoia. In paranoid patients weak 
stimuli evoke some EEG reaction, but 
strong stimuli increase the inhibition and 
hence no reaction is obtained. 

Gavrilova (1960) also notes the pres- 
ence in her paranoiacs (but not in the 
simple or catatonic patients) of frequent 
constellations, i.e., apparently causeless 
bursts of high amplitude potentials last- 
ing .5-2 seconds from one cortical area 
accompanied by low amplitude potentials
from another.

These constellations have also been 
reported in acute, tense, and delirious 
schizophrenics by Belenkaya (1960). 
The suggested explanation of these con- 
stellations is as follows. The develop- 
ment of protective inhibition in schizo- 
phrenia attacks both excitatory and 
inhibiting processes. It first weakens the 
process of internal inhibition in the cor- 
tex, thereby upsetting the balance be- 
tween excitatory and inhibitory proc- 
esses and increasing the strength of 
excitation. Strong excitatory stimuli increase
the protective inhibition and prevent
any reaction, but weak stimuli can
still “get through” in the less advanced
(paranoid) forms of schizophrenia. 
When these weak stimuli do get through 
they produce a violent effect. The 
reason for this is that the cortical process 
of internal inhibition is normally con- 
cerned with damping down incoming 
stimuli, and since this process has been 
weakened, the incoming stimuli can no 
longer be contained. 

In paranoid forms of schizophrenia 
the presence of overflows has been re- 
ported; i.e., a burst of activity in one 
cortical area appears to spread and is 
followed by bursts of activity in other 
cortical areas. These overflows rarely 
occur in normal subjects when they are 
awake, but are present in falling asleep
and during sleep, and also occur in epi- 
leptic cases and in patients with sub- 
cortical tumors. Gavrilova observed that 
external stimuli can induce overflows in 
paranoiacs, and argues that overflows 
are caused by subcortical stimuli acting 
on the cortex and inducing excitation 
which spreads to other areas. She argues 
from the absence of overflows in her 
groups of schizophrenics that subcor- 
tical-cortical relations are impaired. But 
Belenkaya (1960) reported overflows in 
all stages of paranoia, from the initial 
acute delirium to the final “secondary
catatonic” stage. 

A number of Russian investigators 
have followed the course of EEG 
changes during the administration of 
drugs to schizophrenics. Trekina 
(1955), working with 35 chronic de- 
teriorated schizophrenics, reports ab- 
sence of alpha rhythm, lack of any reac- 
tion to light stimuli, and unsynchronized 
random oscillations which are inter- 
preted as indicating excitation in the 
reticular formation. Moderate doses of 
caffeine brought general improvement
and restored the alpha rhythm and the
reaction to light stimuli, and abolished 
or decreased the pathological activity 
from the subcortex. 

A similar experiment by Ekolova-Bagalei
(1955) reports the EEG activity
of 85 catatonic patients (aged 17-45,
duration of illness from a few days
to several years) before treatment and 
after administration of cocaine. The cocaine
improved the patients’ behavior,
so that in the majority of cases they began
to move on instruction and to speak, and
negativism and waxy flexibility disappeared.
(In eight cases of very long
standing schizophrenia, no improvement 
was obtained even with increased and 
extended dosage.) At the same time the 
cocaine had the effect of increasing the 
alpha frequency. In a small number of 
cases, however, large doses of cocaine 
made the patients worse than before. 
The explanation of the findings is that 
small doses of cocaine reduce the amount 
of cortical inhibition, but larger doses 
induce protective inhibition, which fur- 
ther intensifies the inhibited state of the 
cortex. It was noted that cocaine acted 
first by increasing alpha frequency, then 
by increasing sympathetic tone, and 
finally by bettering the patient’s volun- 
tary behavior. It is argued that this 
indicates that cocaine acts first on the 
cortex and that an effective attack on 
schizophrenia can be made by restoring 
the cortical excitatory processes. 

Two papers of Belenkaya (1960, 
1961) report the effects of chlor- 
promazine and the stimulant meratran 
on the EEG activity of paranoid schizo- 
phrenics. It is argued that there are 
typically four successive stages in the 
evolution of paranoid schizophrenia: 
first, paranoid delirium without hallucinations;
secondly, with hallucinations;
thirdly, paraphrenic delirium; and
fourthly, a state of secondary catatonia 
with hallucinations. Forty patients were 
divided into the four groups, and without
drugs the increasingly advanced
stages showed decreasing EEG activity 
and lesser reactivity to light (in these 
experimental conditions normal subjects 
gave a reaction to light on all testings 
compared with 24% of the first group 
of patients and only 5% of the last 
group). All the groups showed over- 
flows. Both meratran and chlorpro- 
mazine had a beneficial effect on the first 
group and to some extent on the second, 
improving their general condition and 
making their EEG records more normal. 
With chlorpromazine, however, there 
was a delayed effect. For 2 or 3 weeks 
the pathological features of the EEG 
increased, especially the number of over- 
flows. After this time the EEG became 
normal. Belenkaya explains these re- 
sults in the following way. Chlorpro- 
mazine depresses the excitation in the 
reticular formation, and during the 2- or 
3-week period this depression of reticu- 
lar excitation allows the internal in- 
hibitory processes to be restored. The 
increased presence of overflows is a sign 
of the increasing restoration of the in- 
hibitory processes (assuming the equiv- 
alence of inhibition, sleep, and the oc- 
currence of overflows). In the next 
stage, the reticular formation recovers 
from the effects of chlorpromazine and 
exerts a normal excitatory effect on the 
cortex, controlled by the restored in- 
hibitory processes. Patients with the 
severer forms of paranoia did not bene- 
fit from either drug and showed para- 
doxical reactions to them, viz., after 
meratran EEG activity decreased and 
after chlorpromazine it increased.

Fedorovsky (1955) compared the
EEG activity of normal subjects and 36 
schizophrenics (hallucinated paranoiacs 
and simple) during sleep therapy. He 
found that the schizophrenic records 
tended to show slow alpha rhythms when 
they were awake, but that during sleep 
they showed lower amplitude slow waves
than normals. It is suggested that this 
indicates that schizophrenics sleep less 
deeply than normal people. A similar 
result is reported with catatonic patients 
by Popov (1955). 


CONDITIONING 


Both autonomic and motor condition- 
ing techniques have been used on schizo- 
phrenics by Russian investigators 
(Dobrzhanskaya, 1955; Kostandov,
1955; Saarma, 1955; Sinkevich, 1955;
Vertogradova, 1955). 

All investigators have found that con- 
ditioning is poor or cannot be obtained 
in schizophrenics. An attempt to con- 
dition vascular reactions has been re- 
ported by Vertogradova (1955), work- 
ing with 30 schizophrenics (mainly 
paranoid and simple, duration of illness 
1 month to 3 years). Unconditioned 
vascular reactions to heat and cold were 
less than normal. A light was used as a 
conditioned stimulus and conditioned 
vascular reactions were typically ac- 
quired after 2-18 pairings of the light 
with the unconditioned stimuli. How- 
ever, the conditioning was not stable; 
i.e., frequently the presentation of the
conditioned stimulus did not elicit a re- 
sponse, and generally the conditioned 
response had disappeared on subsequent 
days. Firm conditioned responses could 
not be acquired with up to 100 pairings. 
It was also observed that in a number 
of cases the presentation of the light 
during the conditioning procedure in- 
hibited the vascular reaction, so that it 
was either reduced or disappeared en- 
tirely. Thirteen patients were retested 
after a course of insulin, and in the 10 
of these who improved, the vascular re- 
actions became stronger and the inhibit- 
ing effect of the light during condition- 
ing disappeared. In the other three 
patients the lack of behavioral improvement
was accompanied by a corresponding
absence of increase in the vascular
reactions. 

A somewhat similar experiment has 
been reported by Trekina (1955) on 35 
chronic deteriorated schizophrenics sub- 
ject to excited outbursts. Plethysmo- 
graph recordings were made of the re- 
actions to the unconditioned stimuli of 
cold, heat, pain, light, and touch. Some 
vascular reactivity was present, and it is 
argued that this indicates that excit- 
atory processes are present in the sub- 
cortex. It was found that on the second 
or third day of the investigation the un- 
conditioned vascular reactions extin- 
guished much more quickly than is 
normal, i.e., after 4-6 presentations. 
This is taken as evidence for the strong 
cortical inhibitory processes in schiz- 
ophrenics. Attempts were then made to 
condition the vascular reactions to light 
and to verbal stimuli, but the condition- 
ing was very slow and the conditioned 
reactions, once established, were very 
unsteady and kept disappearing and re- 
appearing. This again is taken as evi- 
dence for the strong inhibitory processes 
in the cortex. 

A number of experiments use some 
variety of motor conditioning in which 
the subject is instructed to give a re- 
sponse to a certain stimulus (e.g., press- 
ing a buzzer to a light) and is given 
verbal reinforcements when the response 
is made correctly. This conditioning 
procedure is then made more elaborate 
by investigating the discrimination of 
the stimulus from similar stimuli, extinc- 
tion of the response through nonrein- 
forcement, effects of extraneous stimula- 
tion, and developments of conditioned 
disinhibition. The findings most com- 
monly reported using these techniques 
on schizophrenics are as follows: 

1. The speed of conditioning is impaired
in all schizophrenics, but more in
catatonic patients than in paranoiacs. 
Several investigators have found it impossible
to condition catatonics whereas
the conditioning of paranoiacs is possible,
but slow. Typically 3-5 trials are
required to condition normal subjects 
and 15-20 trials in schizophrenics. 

2. The associations are very unstable, 
but can be stabilized with a very large 
number of reinforcements (100 or 
more). 

3. The conditioned reaction is very 
easily inhibited by extraneous stimuli,
i.e., changes in laboratory conditions,
etc., even though these are quite slight. 

4. There is a great variability in re- 
sponse latency. 

5. Discriminations are very difficult 
for schizophrenics to make. Many in- 
vestigators found only a minority of 
patients would make correct discriminations.

6. Improvements in behavior follow- 
ing treatment are paralleled by improve- 
ments in conditionability. 


The impairment of schizophrenics in 
this type of conditioning is generally in- 
terpreted as reflecting the inhibited state 
of the cerebral cortex. The instability of 
the conditioned reactions, inhibition by 
extraneous stimuli and variability in re- 
sponse latencies are regarded as due to 
the strengthening of cortical inhibition 
through negative induction from the sub- 
cortex. Saarma (1955) reports two 
further findings consistent with this 
explanation. When discrimination is at- 
tempted the schizophrenic frequently 
ceases to respond at all. It is inferred 
from this that inhibition has readily be- 
come attached to the stimulus to be dis- 
criminated and has spread to the original 
stimulus. Secondly, when reversal shifts 
(i.e., the positive stimulus is changed to
a negative and the negative stimulus to 
positive) are attempted on schizophren- 
ics, the positive stimulus can easily be 
changed to negative, but in many cases 
it was impossible to change the negative 
stimulus to positive. A similar finding
is reported by Sinkevich (1955). 

A possibly unexpected finding from 
this point of view is that two groups of 
paranoid schizophrenics, while showing 
the typical features of poor conditioning 
listed above, take a large number of non- 
reinforced trials before the conditioned 
response is extinguished. This finding 
has been reported by Dobrzhanskaya 
(1955) and Kostandov (1955). The ex- 
planation advanced by both authors is 
that the slow conditioning of schizophrenics
is due to the protective inhibition.
Extinction and discrimination are
brought about by the accumulation of 
internal inhibition and the process of 
generating internal inhibition has itself 
been weakened by the protective inhibi- 
tion. Hence, the slow extinction and 
discrimination characteristic of schiz- 
ophrenia. 

The effect of drugs on motor condi- 
tioning has been reported by Taranskaya 
(1955). Moderate doses of the stimu- 
lant atropin improved schizophrenics’ 
performance on a motor conditioning 
task, increasing speed of conditioning, 
the stability of the response, and reduc- 
ing the latency. At the same time hal- 
lucinations disappeared. Larger doses 
produced less beneficial effect on condi- 
tioning and increased hallucinations. 
The inhibitory drug phenamin further 
impaired conditionability and increased 
hallucinations. 


WORD ASSOCIATION TESTS


Apart from conditioning, word as- 
sociation tests are sometimes used in 
Russian research on schizophrenia. A 
typical experiment is that of Doku- 
chaeva (1955). The method consists in 
presenting a stimulus word, to which the 
patient has to give a response. The re- 
sponse is scored both for the time taken 
to give it and for its adequacy; e.g.,
repetitions of the word, etc., are scored 


as inadequate. The general findings are 
that schizophrenics are slow and give 
inadequate responses. In Dokuchaeva's 
experiment 60 schizophrenics were given 
the word association test before and 
after varying doses of caffeine. It was
found that moderate doses of caffeine 
increased the speed of the response and 
improved the quality of the associations. 
This improvement appeared to depend 
on the reactivation of the sympathetic 
system, since in cases where caffeine had 
no sympathetic effect the associative re- 
actions remained unchanged. With large 
doses of caffeine the speed of reaction 
became slower and the adequacy of the 
responses deteriorated. 


TREATMENT 


As a logical implication of his theory 
of schizophrenia, Pavlov recommended 
and experimented with prolonged sleep 
as a therapeutic measure. The rationale 
of this procedure was that it strength- 
ened the protective inhibition and al- 
lowed the cortical cells to recover from 
their exhausted state. The most common 
drugs used to induce prolonged sleep in 
the ’30s were Cloetta’s mixture and
sodium amytal, and many Russian 
psychiatrists have been impressed with 
the success of these treatments. On the 
other hand, stimulants have also been 
used for therapeutic purposes, some of 
the most common being insulin, convulsive
therapy, and caffeine. Ivanov-Smolenskii
(1954) considers that the
best results are given by combined 
sleep and stimulant treatment and gives 
the explanation that sleep allows the 
cortex to recover, while stimulants re- 
activate the sympathetic system. Com- 
bined administration of bromide with 
caffeine has also been used successfully 
in the treatment of dogs with experimental
neurosis (Ivanov-Smolenskii,
1954).

In the past few years chlorpromazine 
has been widely used in the treatment
of schizophrenia and a large number of 
experiments have been published on its 
site and mode of action. It is generally 
agreed that chlorpromazine has an inhib- 
iting effect on both conditioned and un- 
conditioned reflexes; according to Sav- 
chuk (1960) dogs and rabbits given 1 
milligram per kilogram chlorpromazine 
give weaker salivatory reactions, but 4 
to 5 days after administration these in- 
crease. There is no agreement among 
Russian investigators about whether 
chlorpromazine acts first on the cerebral 
cortex or on the reticular formation, but 
it is generally held that both are de- 
pressed by large doses. 

A theory to account for the therapeu- 
tic effects of chlorpromazine on catatonic 
schizophrenia has been put forward by 
Zurabashvili (1960). He assumes that 
this variety of schizophrenia is initially 
induced by some toxic substance in the 
blood, citing as evidence in favor of this 
view experiments showing that the in- 
jection of catatonics' blood into dogs has 
an impairing effect. In these experi- 
ments, dogs were trained on a discrimi- 
nation task; after injection with 
catatonics’ blood, the learned discrimina- 
tion broke down and the dogs reacted to 
the negative as well as to the positive 
stimulus (this breakdown was not ob- 
tained after injection with the blood of 
normal adults). However, if before in- 
jection with catatonics' blood the dogs 
were injected with chlorpromazine, the 
impairing effect of the catatonics’ blood 
was counteracted. Since it is known that 
chlorpromazine has a depressant effect 
on the reticular formation, he infers 
from this that the toxic substance in 
schizophrenia has an excitatory effect on 
the reticular formation. During the de- 
velopment of schizophrenia the toxic 
agent raises the level of reticular excita- 
tion, which in turn stimulates the 
thalamic areas and the cerebral cortex. 
This stimulation becomes “above
strength” and induces protective inhibition
in both the cortex and the thalamic
areas, and hence, the signs of inhibition 
in the cortex (low frequency EEG activ- 
ity and slow conditioning) and the low 
sympathetic tone and reactivity. The 
effect of chlorpromazine is to depress 
the reticular excitation; reticular stimu- 
lation of the thalamic region and the 
cortex is thereby reduced and the pro- 
tective inhibition dissipates. As a result 
of this, there is clinical improvement and 
the sympathetic system becomes more 
active. In support of the last point, 
Zurabashvili cites evidence that in cata- 
tonics receiving chlorpromazine there is 
a fall in the number of erythrocytes and 
increases in hemoglobins, sweating, and 
pulse rate. This theory is a departure 
from the commonly held Pavlovian view 
that protective inhibition in the cortex 
is the primary cause of breakdown in 
schizophrenia, and is an indication that 
Russian workers in this field are not 
necessarily fettered by strict observance 
of Pavlov's original theory. 


CONCLUSION 


At an empirical level, Russian work 
on schizophrenia at some points coin- 
cides with that in the West; and at 
others, breaks entirely new ground. 
From this point of view the principal 
findings can be summarized as follows: 

1. The Russian evidence as a whole 
indicates that there are two types of 
schizophrenics: a majority group char- 
acterized by low sympathetic tone and 
reactivity; and a minority group, in 
whom sympathetic tone and reactivity 
are unusually high. The majority group 
would appear to consist mainly of cases 
of catatonic and simple schizophrenia 
and the minority group mainly of acute 
and agitated patients, especially paranoiacs.
In this respect, Russian investigators
seem to have arrived independently
at similar conclusions to
those of several workers in the West who 
have reported two groups of schizo- 
phrenics characterized by high and low 
sympathetic reactivity (Gellhorn, 1958), 
high and low arousal (Venables, 1960), 
and high and low anxiety (Mednick, 
1958). 

2. The Russian findings on the EEG 
of schizophrenics indicate patterns char- 
acteristic of low arousal and drowsy 
states. There are a small number of 
similar Western findings although many 
investigators have failed to find EEG 
differences between schizophrenics and 
normals (Brackbill, 1956). This is pos- 
sibly because Western investigators have 
not looked for the overflows and con- 
stellations described in the Russian lit- 
erature. 

3. Behaviorally, the Russian work 
showing slow reactions in conditioning 
experiments is to some extent paralleled 
by the findings of Eysenck and his as- 
sociates (Eysenck, 1952; Payne & Hew- 
lett, 1960) of slowness in schizophren- 
ics over a variety of tasks. In general, 
however, Russian conditioning tech- 
niques have been very little used in 
Western research on schizophrenia. 

4. A large number of Russian experiments
show that schizophrenics are
unusually sensitive to stimulants, being 
improved by small quantities but im- 
paired by large. If these results are in- 
terpreted as being due to the effects of 
stimulants on arousal, they accord quite 
well with the somewhat similar results 
of Venables and Tizard (1956) and Ven- 
ables (1960), which showed that schiz- 
ophrenics are unusually affected by 
intense stimuli and by the level of back- 
ground stimulation. In this respect, the 
Russian findings support the hypothesis
advanced by Venables (1960) that schizophrenics
can only operate efficiently
within a narrower range of arousal than 
normals. 

At a theoretical level, it will be evident
that Russian work on schizophrenia
is strongly tied to Pavlov's theory and 
is largely concerned with demonstrating 
in detail the truth of Pavlov's hypothe- 
ses or of making modifications to explain 
discrepant findings within the Pavlovian 
framework. It would appear that the 
recent experimental results can be made 
to fit reasonably well with Pavlovian 
theory and the theoretical significance 
of the Russian work on schizophrenia 
depends on the acceptability of general 
Pavlovian theory, especially on the con- 
cepts of protective inhibition and induc- 
tion. A critical appraisal of Pavlovian 
theory as a whole would demand a long 
essay and would be out of place in this 
review. But it is apparent that, at any 
rate in the eyes of Russian investigators, 
Pavlovian theory has withstood the 
attacks of its critics. 


REACTION TO A PLACEBO:


THE MEDIATIONAL DEFICIENCY HYPOTHESIS 


JAMES YOUNISS AND HANS G. FURTH
Catholic University of America 


This paper reconsidered data previously interpreted as supporting the 
mediational deficiency hypothesis. It was concluded that: certain com- 
parisons of theoretical similarities were unclear; inferences from “rever- 
sal-nonreversal” and “transposition” paradigms did not lead to un- 
equivocal interpretations; and crucial evidence from research in deafness 
and from other theoretical positions, such as Piaget’s, were ignored. The reduction
of the role of language in cognitive development to a verbal
mediation construct was considered an oversimplification and a potential
source of confusion.


In a recent article Reese (1962) has 
presented a number of studies in sup- 
port of the hypothesis that verbal media- 
tion is a function of age level. While 
most students of behavior would not 
disagree with the general notion that at 
some stage in their development chil- 
dren perform significantly better on 
some tasks than their counterparts at a 
younger age level, the use of a verbal 
mediation concept to explain this differ- 
ence is, it seems to us, dubious and per- 
haps harmful. It would seem meaning- 
ful for further research to comment on 
certain ambiguities in Reese’s position 
and to question the comparison among 
some of the studies presented in support 
of his hypothesis “that there is a stage 
in development in which verbal re- 
sponses do not serve as mediators,” 
called the “mediational deficiency hy- 
pothesis.” 

Reese defined the mediating response 
in normal subjects, at least, as “an ori- 
enting response, usually verbally di- 
rected, which involves identification of 
and reaction to the appropriate dimen- 
sion” (cues) and as “verbal labels . . . 
which involve” cue specification and 
overt reaction. The ambiguity does not 
rest so much in the use of terms, as 
many operational definitions were presented,
but in the misleading statement
that Luria’s position on control of behavior
through language is similar to
that of Reese. For Luria (1960) the
nominative role of language precedes 
the stage at which internal speech, as 
meaning, can control behavior. For ex- 
ample, Luria presents evidence that in 
children 1-2½ years old, age levels at
which Reese’s mediation deficiency hy- 
pothesis is usually confirmed, labeling 
can facilitate discrimination. 

Another ambiguity in Reese’s posi- 
tion stems from the way in which he 
invokes the concept or construct of 
mediational deficiency to explain, ad hoc, 
various data. If on the same or similar
tasks two groups of subjects, one 
younger than the other, performed dif- 
ferently with the older subjects per- 
forming in agreement with mediational 
predictions, then he concluded “the re- 
sults may be interpreted as supporting 
the mediation deficiency hypothesis.” 
As a word of caution it should be noted 
that younger children may differ from 
older on a number of dimensions (Wohl- 
will, 1962), verbal behavior being one, 
and unfortunately one we know little 
about. 

Just as precarious is the circular rea- 
soning that some tasks did not really 
require verbal mediation—after it was 
shown that younger and older subjects 
found these tasks of equal difficulty.
For example, Reese suggests that “ac- 
quired distinctiveness of cues does not 
involve mediation, according to Dollard 
and Miller's (1950) interpretation” (it
does, however, according to Goss [1961] 
and Lacey [1961]). 

We would also take issue with Reese's 
interpretation of the data from reversal- 
nonreversal shift (R-NR) studies and 
his partial review of transposition investigations.
Kendler and Kendler
(1959) found that children in kindergarten
performed R and NR with equal
difficulty. When divided into slow and 
fast learners according to training trials, 
the former group found NR easier and 
the latter, R easier. While this might be 
supportive evidence for the hypothesis, 
verbal mediation theory would have to 
explain why the faster learners were the 
mediators—in fact the evidence is usu- 
ally opposite, that is, the fast learners 
are usually the low IQ, less hypothesis 
group (Osler & Fivel, 1961). Further- 
more one would have to explain why the 
control group who continued to respond 
to the same relevant cues under no 
change in reinforcement were not dif- 
ferent from the NR group. In the study 
where younger subjects, in nursery 
school (Kendler, Kendler, & Wells, 
1960), found NR easier than R, the 
procedure was radically changed so that 
comparability in interpretation is ques- 
tionable. For example, in the 1959 
study the R and NR groups were not
presented with a choice of dimensions 
during transfer trials as they were dur- 
ing training, while in the latter work no 
choice was presented during training 
and on transfer two dimensions were 
varied, but not the previously relevant 
dimension for the NR group, which of 
course made the task easier. One gets 
the notion that not even Reese is entirely
clear in interpreting R-NR results,
for he concludes that O'Connor
and Hermelin's (1959) investigation
supports the mediational deficiency hy- 
pothesis. O'Connor and Hermelin found 
that retardates performed R with fewer 
errors (trials) than normal subjects, 
mean age = 5.1 years, and Reese had
to assume that for retardates the “verbal
response [is] functionally equivalent
to a nonsense name.” It might be
added that House and Zeaman (1962), 
in a study too recent to be included in 
Reese's reviews, found that retardates
performed R in fewer trials
than NR—even when distinctiveness of 
cues through nonsense labeling could 
not have helped, since specific cues were 
changed within dimensions on the shift 
task. 

In his partial review of the transposition
studies Reese acknowledges that
transposition effects can be accounted 
for by explanations other than verbal 
mediation, for example, discrimination 
learning set. For example, Hunter 
(1952) found transposition in subjects 
who were judged not to have the cor- 
rect concept, a result not predicted by 
Reese's hypothesis. The author sug- 
gested that a discrimination learning 
set, “less dependent on age level than
verbal mediation is,” may account for
Hunter's results and “therefore may not
be relevant to the present topic.”

An adequate analysis of transposition 
studies cannot be offered in the present 
paper but a fair evaluation might be, as 
Rudell (1958) stated, “that the rela- 
tional response in transposition is not 
reducible to responses to components 
according to current S-R theory, nor is
it dependent, in any presently predictable
manner, upon level of verbal
ability.”

While Reese has reviewed a com- 
mendable number of studies, we might 
point out that a small but interesting 
series concerning deaf subjects has not
been meaningfully considered. Although
our knowledge of the deaf is admittedly
sketchy, they surely would fit an opera- 
tional definition of a group deficient not 
only in verbal mediation but the labels 
themselves. 

It should be of concern to proponents 
of a verbal mediation theory that more 
recent and better controlled investiga- 
tions of deaf children's cognitive behav- 
ior fail to show differences which the 
theory would predict. On transposition 
(Oléron, 1957), classification (Kates, 
1961), conceptual discovery, and con- 
trol (Furth, 1961), deaf children at vari- 
ous age levels performed very much like 
their hearing controls. We believe that 
such convergent findings of unexpected 
results are of greater import in weaken- 
ing an oversimplified theory than an 
admitted number of studies in which 
deaf in comparison to hearing children 
were found to be inferior (Furth, in 
press; Oléron, 1961). The differential 
impact of early language deficiency may 
be of a more indirect nature, that is, 
perceptual dominance of Oléron or di- 
minished general experience of Furth. 
In that case theorizing about a defi- 
ciency in labeling sounds unconvincing. 

Any attempt to explain a large num- 
ber of studies under a relatively new 
concept, such as recent verbal media- 
tion theory, is a difficult task. This is 
evident from Reese's conclusions that 
“The studies reviewed indicate that the 
critical age for the occurrence of media- 
tion may be different for different ex- 
perimental situations," may or may not 
be related to awareness, might or might 
not be a function of level of establish- 
ment of the concept, and could or could 
not be voluntary. As a general conclu- 
sion we might add that Reese has 
pointed out a number of important and 
promising areas for further research. 
But these are not new questions nor are 
they peculiarly related to the mediation 
deficiency hypothesis.

Perhaps the most serious objection to
Reese's position is that he has ignored 
the work of Piaget, Werner, Vygotsky, 
and even Luria—whose experimental 
programs have dealt directly with the 
relationship of language to development. 
In light of their work, we would conclude
that the mediational deficiency hypothesis
is an oversimplification of a
complex issue, whose solution has been
shown to be resistant to simple explanations.

A REPLY TO YOUNISS AND FURTH


HAYNE W. REESE 
State University of New York at Buffalo* 


(a) Youniss and Furth err in referring to the "mediational deficiency 
hypothesis" (MDH) as explanatory; it is only descriptive. (b) They 
misinterpret Dollard and Miller's explanation of the acquired distinctive- 
ness of cues (ADC). Mediation is not involved in ADC because the 
response-produced cues become part of stimulus complexes, and do not 
intervene between stimuli and responses. (c) Further evidence is needed 
to determine the generality of the MDH and the causes of the deficiency. 


Youniss and Furth (1963) raise three 
objections which are not adequately 
answered by reference to my original 
article (Reese, 1962). One concerns the 
nature of the “mediational deficiency
hypothesis,” which they refer to as explanatory.
The hypothesis is not explanatory;
it only describes a particular
age difference. It is a statement of 
inferences about mediation, based on 
the finding in certain experimental sit- 
uations that older children perform as 
would be predicted if mediation occurred 
and younger children tend to perform 
as would be predicted if mediation did 
not occur. Kendler and Kendler’s 
(1959) conclusion was based on similar 
inferences about mediation in fast and 
slow learners. Such inferences are re- 
quired because mediating responses are 
usually not directly observable. In con- 
nection with Youniss and Furth's (1963) 
objection to Kendler and Kendler's 
(1959) inferences, Osler and Fivel 
(1961) actually found that the high IQ 
group included more fast learners than 
the lower IQ group, and Osler and Fivel 
concluded that this finding was “en- 
tirely consistent with recent findings by 
Kendler and Kendler (1959) and Kend- 
ler, Kendler, and Wells (1960) [p. 7].” 

A second objection concerns Dollard 
and Miller’s (1950) theoretical explanation
of the acquired distinctiveness
of cues. Dollard and Miller assumed
that if the subject is required to
learn to respond to one of two stimulus 
complexes and to inhibit response to the 
other, the learning is facilitated by “ac- 
quired-distinctiveness” pretraining be- 
cause of the addition of a different 
response-produced cue to each stimulus 
complex, which makes the stimulus com- 
plexes more distinctively different than 
they would be without the additional 
cues. Since the response-produced cue is 
a part of the stimulus complex, it can- 
not intervene between the stimulus com- 
plex and the terminal response, and 
mediation in the currently accepted 
sense (see Goss, 1961) is not involved. 
In an earlier statement of the theory, 
Miller (1948) said, 
learning to respond with highly distinctive 
names to similar stimulus situations should 
tend to lessen the generalization of other
responses from one of these situations to 
another since the stimuli produced by respond- 
ing with the distinctive name will tend to 
increase the differences in the stimulus pat- 
terns of the two situations [p. 174]. 


It is significant that mediation was not 
mentioned, although Miller used the 
word in describing acquired equivalence 
of cues (also on p. 174). 

Arnoult (1957) has reviewed studies 
which show that acquired-distinctive- 
ness pretraining does not improve later 
psychophysical discrimination, contrary 
to implications of this theory. The va- 
lidity of the theory is irrelevant to the 
present discussion, but it might be noted
that Goss's (1961) discussion of acquired
distinctiveness implies that mediation
is required (Goss does not
attribute this notion to Dollard and 
Miller) and psychophysical discrimina- 
tions would not be affected. 

Regarding the third objection which 
requires comment, my review shows that 
the occurrence of a deficiency in media- 
tion in certain experimental situations 
needs no further documentation, al- 
though further research is obviously 
needed to determine (a) in which addi- 
tional situations mediation fails to 
occur, and (b) the causes of the mediational
deficiency. (Reese [1963a,
1963b; Reese & Ford, 1962] has dem- 
onstrated mediational deficiency in a 
perceptual set situation, and suggested 
alternative causes of the deficiency. 
Kendler [1961] has also outlined al- 
ternative theoretical explanations of the 
deficiency.) It may be, as Youniss and 
Furth (1963) seem to suggest, that re- 
views of the research using deaf subjects 
and of the work of Piaget, Werner, 
Vigotsky, and Luria are relevant to these 
two needs, but the question of their 
relevance is empirical and can be an- 
swered best by the writing of further 
reviews. As noted in the first para- 
graph, not all age differences indicate 
mediational deficiency in the younger 
subjects; the situation must be one in 
which mediation can be inferred in the 
older subjects.

CLARIFICATION OF BARTLEY'S MODEL:
A REPLY TO THROSBY ¹


S. HOWARD BARTLEY 
Michigan State University 


A statement of Bartley’s model which indicates that more than 1 pulse 
to cycle fraction produces critical flicker frequency for a single combina- 
tion of cycle length and intensity. The phenomenon itself is contrary 
to long-held belief regarding conditions for just eliminating flicker pro- 
duced by a train of photic pulses. The model supposes that off responses 
from the retina signal termination in stimulation and that since these 
occur only after inputs reach certain durations, extending duration 
beyond the value first producing fusion reintroduces flicker, and still 
further extending input duration again produces fusion through the inhi- 
bition of the off response by the close succession of the next photic pulse. 


Throsby (1962) has discussed the 
factor of pulse to cycle fraction (PCF) 
in the production of critical flicker fre- 
quency (CFF), the intermittency rate 
just producing steady brightness. The 
PCF, in many places, has been called 
the light:dark ratio. Throsby symbol- 
izes it by Pr. 

There are three methods or sets of 
operations that can be used to obtain 
information regarding intermittent stim- 
ulation. Method 1 involves increasing 
intensity, while holding PCF and target 
area constant. (This method produced 
the Ferry-Porter law.) By this method, 
a family of curves can be produced, each 
curve representing a different fixed PCF 
and/or a different fixed target area. 

Method 2 involves increasing PCF, 
while holding intensity and area con- 
stant. (This produced Talbot’s law.) 
Owing to apparatus limitations, PCF 
was generally varied in stepwise fashion. 
A family of curves, each curve repre- 
senting a CFF-PCF relation for a dif- 
ferent intensity and/or area can be ob- 
tained by this method. 

Method 3 involves increasing PCF,
while holding intermittency cycle length,
intensity, and area constant. This
method differs from the other two since
it does not provide CFFs but rather indicates
PCF values involved at transitions
between flicker and fusion, or fusion
and flicker. (This method has provided
the direct test of the Bartley
model.) By it a number of sets of data
can be obtained, each set for a different
intensity level, and/or a different length
of intermittency cycle.


1 Work on the problem has been supported

Ever since the days of Talbot in the 
first half of the nineteenth century, it 
has been assumed that once intermit- 
tency rate is raised to the point of pro- 
ducing a steady brightness for a given 
combination of PCF, intensity, and 
target area, increasing PCF will not re- 
introduce flicker, but will only raise 
brightness. Some years ago, some doubt 
was cast on this conclusion. 

Bartley (1937), using Method 1, 
noticed that several curves in his family 
of PCF curves crossed each other. He 
verified this finding in many experi- 
ments since and he (Bartley, 1958) in- 
terpreted it as meaning that more than 
one PCF would yield the same CFF 
when target intensity and area are held 
constant as shown in Figure 1. He had 
already pointed out that very different 
PCFs could produce CFF for almost the 
same cycle lengths (Bartley, 1951).
Thus the conventional expectation stated
in italics above could not be correct.

Bartley and Nelson (1961) studied
more extensively the shapes of CFF and 
PCF curves. Again, by Method 1, it 
was shown that certain PCF curves, 
instead of forming a family of some- 
what parallel members, intersected each 
other. They also produced CFF curves 
somewhat similar to those produced by 
such workers as Ives (1922), but these 
curves contained certain irregularities 
which were interpreted as additional 
evidence of the equivalence of multiple 
PCFs in producing transitions between 
fusion and flicker. Bartley (1958; Bart- 
ley & Nelson, 1960) accounted for the 
crossing of PCF curves by resorting to 
known facts of neurophysiology. It is 
known that all ganglion-cell discharges 
do not bear the same quantitative rela- 
tions to the stimulus input. Some cells 
discharge only following the termination 
of stimulus input (Hartline, 1938). 
Bartley and Bishop (1942) had shown 
that “on” and “off” responses require 
different photic pulse durations to elicit 
them. Only when the photic pulse exceeds
a certain duration is a sizable off
response elicited. Thus it requires a 
longer pulse to produce a sizable off 
response than to produce an on response. 
It is supposed that in producing flicker 
and fusion the cortex uses the on and 
off responses in different ways. It is 
known that these two sorts of discharge 
are preserved with considerable separateness
all the way to the cortex (Bartley,
1936). The CFF is not supposed to
be simply an indicator of the stimulus 
rate at which on and off responses, com- 
ing at different instants, give some sort 
of temporal continuity of input to the 
cortex, although continuity provided by 
some means is a factor. Off responses, 
since they emerge only after photic 
pulses have reached a sizable duration, 
probably signal the termination (or
diminution) of stimuli that have reached
a certain critical duration. We know 
that in sensation, some flashes we see 
are too short to be experienced as onsets, 
continuations, and terminations, but 
that these features can be experienced 
when stimuli are long enough. It seems 
that the off response is the input feature 
that provides the definite experience of 
termination in such cases. 

Bartley (1936) supposed that when 
PCF is low, the member pulses are too 
short to produce off responses in the 
optic nerve, but as PCF is increased, off 
responses come to be produced. He sup- 
posed that increasing PCF already pro- 
ducing fusion would finally reintroduce 
flicker since, in general, off responses 
supposedly signal either termination or 
diminution of input. 

Since still higher PCFs finally shorten 
the null period to the point of virtual 
nonexistence, very high PCFs must produce
fusion. But an additional well-known
fact in regard to off responses is
relevant here. Granit and Therman 
(1935) showed that the off response in 
the electroretinogram can be inhibited 
if a photic pulse follows its predecessor 
after a very short interval. Applying 
this to the present situation, it would be 
expected that as PCF is increased, the 
diminishing null interval in the intermit- 
tency cycle would become so short that 
off-responses would be inhibited. Then 
flicker would disappear before the null 
periods would become so short as to be 
ineffective as gaps in illumination. 

This, in essence, is the Bartley model, 
and it was directly tested (Bartley & 
Nelson, 1961) by an apparatus by which 
cycle length and intensity were constant 
and PCF varied (Method 3). Beginning
with PCF producing fusion (steady
brightness), PCF was increased and produced
flicker and further increased and
reinstated fusion. A PCF range from
1/30 to 29/30 was used. This finding
confirmed the expectations already
stated; namely, that more than one PCF 
produces a transition between flicker 
and fusion as cycle length is held 
constant. 
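
The logic of the model for one fixed cycle length can be sketched in a few lines of Python. This is only an illustrative sketch: the two threshold parameters (`off_min_pulse_ms`, `inhibit_max_gap_ms`) and their default values are hypothetical and are not taken from the data, and the sketch assumes a cycle short enough that the shortest pulses already yield fusion. Only the PCF range of 1/30 to 29/30 comes from the text.

```python
from fractions import Fraction


def predicted_percept(pcf, cycle_ms, off_min_pulse_ms, inhibit_max_gap_ms):
    """Qualitative prediction for one PCF at a fixed cycle length.

    Assumes the cycle is short enough that on responses alone give fusion.
    Both thresholds are hypothetical illustration values.
    """
    pulse_ms = pcf * cycle_ms      # duration of the photic pulse
    gap_ms = cycle_ms - pulse_ms   # null interval before the next pulse
    if pulse_ms < off_min_pulse_ms:
        return "fusion"            # pulse too short for a sizable off response
    if gap_ms <= inhibit_max_gap_ms:
        return "fusion"            # off response inhibited by the next pulse
    return "flicker"               # off response present and uninhibited


def count_transitions(cycle_ms=30, off_min_pulse_ms=10, inhibit_max_gap_ms=5):
    """Sweep PCF from 1/30 to 29/30 and count flicker-fusion transitions."""
    pcfs = [Fraction(k, 30) for k in range(1, 30)]
    percepts = [
        predicted_percept(p, cycle_ms, off_min_pulse_ms, inhibit_max_gap_ms)
        for p in pcfs
    ]
    return sum(a != b for a, b in zip(percepts, percepts[1:]))
```

With the hypothetical defaults, the sweep passes from fusion to flicker and back to fusion, that is, two transitions at one cycle length, which is the qualitative pattern the model expects.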

Nevertheless, Throsby has criticized 
the Bartley model as if it were exclusively
a conceptual scheme. In so doing,
she missed its main feature. Her argu- 
ment seems to be that a CFF curve 
indicates the transition between flicker 
and fusion at all points along its course. 
This is true but irrelevant in the way 
she uses it. She seems not to be clear 
about the details of the three methods 
of experimentation already described 
and so fails to see that the horizontal 
line in her Figure 3 (our Figure 1), 
which labels constant cycle time, is not 
a CFF curve; thus her statement about 
the meaning of Points A, B, and C is ir- 
relevant. The relevant fact is that the 
line represents the chosen cycle time and 
intersects a CFF curve at these points. 
The region of the line from A to B says 
that if PCF in this region were used, 
fusion would result, and if the region 
from B to C were used, flicker would 
result, and finally, if PCFs in the region 
beyond C were used, fusion would be 
reinstated. Lest it be misleading to have 
talked in terms of PCF rather than 
photic pulse duration, let it be under- 
stood that PCF always implies a known 
pulse length for a given cycle length 
(“cycle time"). 
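
The correspondence between PCF and pulse length mentioned here is simple arithmetic, and a short Python sketch may make it concrete. The function name and the example values (a 40-millisecond cycle with a PCF of 0.25) are hypothetical and chosen only for illustration.

```python
def pulse_and_null_ms(pcf, cycle_ms):
    """Return (pulse duration, null interval) in ms for a given PCF and cycle length."""
    if not 0 < pcf < 1:
        raise ValueError("PCF must lie strictly between 0 and 1")
    pulse_ms = pcf * cycle_ms        # portion of the cycle occupied by the pulse
    null_ms = cycle_ms - pulse_ms    # remaining dark portion of the cycle
    return pulse_ms, null_ms


# Hypothetical example: PCF of 0.25 in a 40-ms cycle.
pulse, null = pulse_and_null_ms(0.25, 40.0)
```

In this example the pulse lasts 10 ms and the null interval 30 ms, so stating PCF together with cycle length fixes both durations.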

Were a longer cycle time chosen (Line 
L in Figure 1), it would fall below the 
CFF curve for most of its course and 
then lie above it for only the final part. 
Thus under these conditions there would 
be only one transition between flicker 
and fusion. Thus it is to be expected 
that a cycle length can be so long as to 
produce flicker for most of its course, 
and fusion only when the pulse fills al- 
most all of it. This is the conventional 
expectation. Therefore it can be said
that not all conditions produce the
series of flicker-fusion transitions. The 
model does tell one this and enables one 
to determine the conditions for both sets 
of results. 

Throsby’s statement, that we have 
not shown that the very shortest pulses 
(low PCFs) always produce flicker, is 
not the devastating criticism she believes 
it to be. Whether this is true or untrue 
in no way nullifies the findings already 
obtained by the fixed-cycle length 
method (Method 3). 

The central issue involved in the 
Bartley model is whether or not when a 
given combination of cycle, intensity, 
and PCF just produces steady bright- 
ness, a lengthening of PCF will never do 
anything more than simply increase 
brightness. The literature up to the pro- 
duction of the model never gave any hint 
of expecting anything but increase in 
brightness. No reinstatement of flicker 
by increasing PCF was expected. We 
have shown the expectation to be false. 

Finally, one ought to recognize there 
are three dimensions to a model and 
thus three ways in which it is to be eval- 
uated. The first is how well it handles 
the facts it presumes to deal with; the 
second is the breadth of the facts it can 
integrate; the third, its fruitfulness in 
predicting phenomena. Discussion to 
this point and Throsby's critique
throughout considered only the first 
dimension. Even there she did not show 
that it failed to handle the essential 
facts. It should be recognized that 
Bartley’s model does relate findings from 
two distinct areas, sensory psychology 
and neurophysiology, and that the model 
has inherent in it a number of further 
predictions which merit testing.

SOME EFFECTS OF LIGHT UPON THE BEHAVIOR
OF RODENTS ¹


ROBERT B. LOCKARD ²
University of Wisconsin 


Topics discussed are light aversion, light reinforcement, self-regulated 
exposure to light, activity and activity rhythms affected by light, and 
physiological changes dependent upon light. It is argued that the effects 
of light are manifold and persistent, and thus cut across usual research 
categories; that different experimental procedures may not measure entirely
separate light-controlled phenomena; and that theories of light-controlled
behavior might profit from considering a wider scope of data
whose common denominator is light.


Almost 3 decades ago, Turner (1935) 
remarked upon “the use of illumination 
as a motivating factor.” Subsequent re- 
search increasingly characterized the rat 
as light aversive, and many still regard 
light as a stimulus with motivational 
properties that become increasingly 
negative as intensity increases. But in 
1953 both Girdner and Henderson 
showed that dim light produced by bar 
pressing had positive reinforcing prop- 
erties. This positive effect was identified 
with stimulus change and exploration, 
not with light per se. Hence research 
tended to persist within the categories 
of light aversion and stimulus change 
without being united by virtue of their 
common continuum. 

A third category of research concerned 
with light-controlled behavior has re- 
cently appeared which rejects dichoto- 
mous views and strives to show how 
behavior is a continuous function of 
light intensity. While the light aversion 
studies used a single lever to turn off the
test light and the light-reinforcement
studies used a single lever to turn on a 
light, this third group of studies uses two 
levers which turn a test light both on 
and off. This new procedure has pro- 
vided new information and implications 
about the two older categories of 
research. 

A fourth group of studies has been 
historically distinct because bar pressing 
was not the response and the theoretical 
context was neither light aversion nor 
stimulus change. This group of studies 
concerns the effect of illumination upon 
activity and behavioral periodicities, and 
thus is relevant to all procedures using 
light as either a static variable or as the 
consequence of a response. For example, 
even brief exposures to dim light can 
have dramatic effects upon the activity 
of certain rodents (De Coursey, 1960);
and, though no example is yet available
for rodents, Wahlström (1960) showed
that canaries would perch on levers in 
such a way as to put themselves on a 
diurnal cycle of light and darkness. 

Finally, there is a body of literature 
which shows that the effects of light 
extend beyond the behavioral realm into 
the very physiology of the animal. A
rat reared in the dark is physically different
from one reared in light, and a
mouse on a light-dark cycle is physio- 
logically different at noon than at mid- 
night. 

This article is concerned with the 
effects of light as a form of energy and 
not with the signal functions light may 
acquire through learning. It will be seen 
that the effects of light are often so pro- 
found and persistent that not only must 
our methodology be refined, but possibly 
our theories of light-controlled behavior 
must regard light as more than a tran- 
sient and trivial stimulation. In short, 
the five categories of research mentioned 
above may be interrelated in such a way 
that bar pressing to produce or remove 
light is a deceptively simple procedure, 
actually attended by a host of varia- 
bles cutting across traditional lines of 
research. 

In the following sections, each cate- 
gory of research is discussed with respect 
to how identified variables affect the 
behavior. When theoretical issues have 
been an important part of the literature, 
these are discussed within the area from 
which they came. Sometimes a finding 
in one category has implications for the 
procedures or theory of another, and this 
is discussed where it seems warranted.


BRIGHT LIGHT AS AN AVERSIVE
STIMULUS 


The "negative phototropism" of young 
hooded rats was discussed by Crozier 
and Pincus (1926, 1937), who described 
a "seeking of the dark" after the eyes 
opened. Turner (1934a, 1934b, 1935),
working with albino pups 14-19 days 
old, clearly recognized the motivational 
properties of light as an "irritating" 
stimulus, and suggested a relationship
between intensity and degree of motiva- 
tion. Extending such observations to 
adult albino rats, Keller (1941) initiated 
the modern procedure by both introducing
Skinnerian operant techniques and
manipulating illumination as a variable.
Rats were tested individually in a box 
whose lever turned off, for 1 minute, a 
bright light ranging from 5 to 152 foot- 
candles in different tests. After a period 
of darkness, the light came on again and 
could be extinguished by the first re- 
sponse after a 15-second exposure to 
light. Both latency and frequency of 
responses were measured and used as 
indices of aversion. 
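
The contingency just described can be sketched in Python as a small bookkeeping routine. This is a sketch under stated assumptions, not Keller's scoring method: it assumes the session starts with the light on at time zero and that latency is measured from light onset, neither of which is specified above. The function name and the example response times are hypothetical.

```python
def effective_responses(response_times_s, min_exposure_s=15.0, dark_s=60.0):
    """Return latencies (s from light onset) of responses that extinguish the light.

    Assumptions (not from the source): the light is on at t = 0, and latency
    is timed from the most recent light onset.
    """
    light_on_at = 0.0
    latencies = []
    for t in sorted(response_times_s):
        if t < light_on_at:
            continue                     # box is dark; response has no effect
        if t - light_on_at < min_exposure_s:
            continue                     # light has not yet been on for 15 s
        latencies.append(t - light_on_at)
        light_on_at = t + dark_s         # 1 minute of darkness, then light returns
    return latencies


# Hypothetical response times in seconds.
latencies = effective_responses([5, 20, 50, 100])
```

Of the four hypothetical presses, only those at 20 and 100 seconds turn off the light; the press at 5 seconds comes too early and the press at 50 seconds falls in the dark period.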

The other procedure serving as a refer- 
ence in this area requires locomotion 
from one place to another and was pio- 
neered by Keller and Oberlin (cited by 
Hanson, 1951). Hanson (1951) used 
it as a tilting-floor technique in which 
going to one end of a box turned off the 
test light, and Zeaman and House 
(1950) extended it to shuttling from one 
compartment to another. 


Variables Affecting Performance 


The one variable repeatedly investi- 
gated is intensity of illumination. Keller 
(1941) found that as illumination in- 
creased, latency of response decreased, 
and measures of response frequency increased.
Kaplan (1952) also used a
bar-press procedure, but extended the 
range of illumination much higher. Re- 
sponse rate was not a monotonic increas- 
ing function of illumination, but passed 
through a maximum, then decreased 
somewhat. Kaplan interpreted this in 
terms of a two-factor theory in which 
the reinforcing effect of light termination 
becomes outweighed by the depressant 
effects of strong stimulation. 

Test illumination was first extended 
down into the “dim” region by Hanson 
(1951), who used the locomotion procedure.
The amount of time spent in
the “dark” end of the box was measured
for 15 albino rats tested in 15 different 
illuminations. When plotted as a function
of test illumination, “time spent in
dark” emerges as an ogive-shaped function;
in very dim light (.038 foot-lamberts),
about 55% testing time was
spent in darkness. In bright light (80
foot-lamberts), about 98% testing time
was spent in the dark. This psychophysical
approach provided an “aversion
threshold” of about 1 foot-lambert,
where the rats spent 75% of testing time
in the dark. Hanson's paper is outstanding
both because of the wide range of
luminance values used and the excellent 
discussion of methodology. For un- 
known reasons, his rats did not spend 
more than 50% testing time in the dim- 
mer illuminations; hence, Hanson missed 
the discovery of the reinforcing proper- 
ties of dim light, to be discussed later. 
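
How a 75% criterion yields an "aversion threshold" can be illustrated with a crude Python sketch. It interpolates linearly on log luminance between the two end points quoted above (55% at .038 foot-lamberts, 98% at 80 foot-lamberts). Linear interpolation is an assumption made here for illustration; Hanson's function was ogive-shaped and rested on many intermediate illuminations, so this is not his procedure.

```python
import math


def interpolate_threshold(lum_lo, pct_lo, lum_hi, pct_hi, criterion_pct=75.0):
    """Luminance at which percent time in dark reaches the criterion.

    Crude linear interpolation on log10 luminance between two points;
    an illustrative assumption, not the original fitting method.
    """
    x_lo, x_hi = math.log10(lum_lo), math.log10(lum_hi)
    frac = (criterion_pct - pct_lo) / (pct_hi - pct_lo)
    return 10 ** (x_lo + frac * (x_hi - x_lo))


# End points quoted in the text (foot-lamberts, percent time in dark).
estimate = interpolate_threshold(0.038, 55.0, 80.0, 98.0)
```

Even this rough two-point estimate falls between 1 and 2 foot-lamberts, the same order of magnitude as the reported threshold of about 1 foot-lambert.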

While it might be assumed that bright 
light acts immediately as an uncondi- 
tioned stimulus to produce escape be- 
havior, some interesting adaptation ef- 
fects were found by Jerome, Moody, 
Conner, and Ryan (1958). Escape from 
a lighted chamber into a dark one was 
maintained in frequency when the test 
light was bright, but decreased within 
the 15-minute session for rats running 
from dim lights. Dim lights were ini- 
tially as aversive as bright ones, but dim 
light would not sustain performance 
while bright light would. Although this 
finding suggests that the way a rat per- 
forms may depend upon history of ex- 
posure to light, this variable was not 
pursued in the light-aversion context 
and is discussed later. 

Other investigators to use bright light 
as a noxious stimulus were Hefferline 
(1950), Jerome and Flynn (1950), 
Flynn and Jerome (1952), and Strange 
(1954). The gross fact of light aversion 
in albino rats is well established, seems 
positively related to light intensity, and 
is a useful technique of negative motiva- 
tion. However, one goal of this type 
research is to progress beyond demon- 
strations of a relationship between aver- 
sion indices and light intensity. Evi- 
dence to be discussed later suggests that 
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the aversiveness of a given illumination 
depends both upon the strain of the 
animal and how bright an environment 
it came from. Hence there is no single 
light-aversion curve for rats. Formula- 
tions of the future will have to include 
more variables than illumination to be 
useful in actually predicting behavior. 


Theories of Light Aversion 


No substantial body of theory has 
developed. Most research has either had 
an empirical emphasis or has merely 
classified light with other noxious stimuli 
such as shock. The underlying (but un- 
expressed) assumption seems to have 
been that aversiveness begins at some 
threshold value and increases as illumi- 
nation increases, without the function 
ever passing through a region of positive 
motivational properties. Evidence to the 
contrary is discussed in the next two 
sections. 


DEMONSTRATIONS OF POSITIVE 
REINFORCEMENT BY LIGHT 


Girdner (1953) was interested in non- 
nutritive reinforcers, Henderson (1953) 
in stimulus intensity dynamism; both 
discovered the reinforcing properties of 
dim light onset independently in 1953. 
Both procedures were essentially alike, 
and represented a sort of backwards 
version of Keller’s (1941) arrangement; 
instead of reinforcing a bar press with 
darkness, the Girdner-Henderson pro- 
cedure leaves the rat in darkness except 
when the bar is pressed. Then a dim 
light comes on for either a short, fixed 
duration or for as long as the bar is 
depressed. Reinforcement is demon- 
strated by a frequency of response 
greater than that of controls who receive 
no light consequence, or by an increase 
over a previous “no light” operant rate.
Both codiscoverers mentioned a stimulus 
change or novelty factor and argued 
against a secondary reinforcement interpretation.

Further demonstrations of this effect
were quick to follow, and emphasis 
shifted from simple demonstrations to 
determinations of how certain variables 
affect the dependent variable, response 
rate. Generally, the procedures have 
been comparable to the Girdner-Hender- 
son procedure in which a single lever 
turns on a light. The variations intro- 
duced usually consist of a treatment ap- 
plied to one group and not to another 
before or during testing; the treatment 
effect is then assayed by a comparison of 
operant rate between groups. Sometimes 
the more sophisticated parametric ap- 
proach is employed, in which the treat- 
ment is applied at different levels. The 
number of studies investigating effects 
of variables warrants grouping them 
under relevant headings. To facilitate 
discussion, the light contingent, bar press 
rate will be abbreviated LCBP following 
the convention of Premack and Collier 
(1962). 


Variables Affecting Performance 


Light intensity. Since a very bright 
light is likely to inhibit bar pressing and 
a below-threshold light cannot reinforce 
bar pressing, it is reasonable to expect 
parametric studies to reveal a concave- 
upward function relating LCBP to light 
intensity. Unfortunately, the available 
data are hardly so neat. Henderson 
(1953, 1957) found a maximum LCBP 
around 16 millilamberts and a minimum 
at either about 50 millilamberts or dark- 
ness; but between .02 and 16 milli- 
lamberts LCBP did not increase mono- 
tonically with luminance, but underwent 
a series of reversals. While the low rate 
for 50 millilamberts is striking, it re- 
mains uncertain just how the dimmer 
lights affected LCBP. 

In a similar parametric study, Levin 
and Forgays (1959) found evidence 
suggesting an LCBP maximum which 
differed for two age groups. While older 
animals responded significantly more 
than young ones, the interaction of age
with light intensity was not significant,
and the main effect of light intensity 
was not significant. Certain multiple t
tests, however, suggested complex effects 
involving an interaction of age, intensity, 
and prior experience. 

Using hooded rats, Stewart (1960) 
tested across a wide range of illumina- 
tion (.01 to 8.5 foot-candles) for 17 
days. The LCBP was virtually the same 
across illumination. Thus while it is 
tempting to assume a curvilinear rela- 
tionship between LCBP and illumina- 
tion, the sole solid support for such a 
conclusion rests on Henderson's 50-millilambert
group. Thus, while “differences”
in the statistical sense have been
demonstrated, the area of light reinforce- 
ment still awaits well-controlled studies 
to determine operant rate as a function 
of light intensity. 

Deprivation of light. The effect of 
light deprivation upon LCBP is of some 
importance, for theories based upon in- 
gestion models would predict an increase 
in LCBP following deprivation. Adapta- 
tion level theory (Helson, 1959), on the 
other hand, would make a contrary pre- 
diction. If the data were to follow one 
or the other of these models, it would 
be a clue as to what kind of response 
system LCBP was. Unfortunately, the 
evidence is not only unclear, but suggests 
difficult methodological problems. For 
example, Premack, Collier, and Roberts 
(1957) reported that LCBP increased 
with deprivation, but Premack and Col- 
lier (1962) reported an interval, or “bar 
deprivation,” effect; LCBP increased 
with time between test sessions, regard- 
less of maintenance luminance. Since 
this session effect is normally confounded 
with hours of light deprivation, any find- 
ings without careful controls are of little 
value. Robinson (1960) used two groups 
kept in different illuminations and found 
no LCBP difference between them upon 
testing. 

Maintenance history. Since light deprivation
can be indefinitely extended
without obvious harm to the animal, a
very long deprivation period might be 
expected to produce more potent effects 
than short-term procedures. Roberts, 
Marx, and Collier (1958) reared albino 
rats in either a darkened environment 
or a brightly lit one, then tested LCBP 
for both light onset or offset. The high- 
est responders for either direction of 
change were those which produced their 
maintenance illumination by responding. 
As in the Jerome et al. (1958) data, 
there is the suggestion that some sort 
of adaptation effect is operating. 

Food and water deprivation. The 
LCBP seems to respond to deprivation, 
but the manner in which it does so may 
be affected by static variables that differ 
between experiments, for findings differ. 
Kling, Horowitz, and Delhagen (1956) 
found lower LCBP for deprived animals, 
even though these were more active than 
sated rats during unreinforced operant 
pretests. Premack and Collier (1962) 
found the opposite; deprived rats, rerun 
through testing a second time, main- 
tained the shape of the previously deter- 
mined function but showed elevated 
LCBP all along the function. Sometimes
hours of food deprivation is manipulated 
as a variable. Again, a "deprivation 
effect" may appear, but shows no im- 
pressive consistency across studies. Hur- 
witz and De (1958) found LCBP 
reached a minimum at 12 hours depriva- 
tion, but Davis (1958) found that LCBP 
increased monotonically with hours of 
deprivation. While there is some agree- 
ment that food-deprived rats show 
greater LCBP than sated ones, a central 
problem lies in distinguishing relative 
increases in response rate from absolute 
increases. In one study with good con- 
trols (Forgays & Levin, 1958), an effect 
of deprivation was shown in that de- 
prived rats bar pressed for light about 
1.4 times as much as sated rats. But 
since deprived controls bar pressed for 
no light more than twice as much as 
sated controls, one could view the data
in terms of how a contingent light in-
creased LCBP for either deprived or
sated rats. When thus viewed as a 
relative increase over unreinforced (no 
light) rate, the effects of deprivation 
were to lower relative responsiveness 
attributable to the contingent light. Thus 
conclusions depend in part upon the 
arithmetic chosen, and until conventions 
are established regarding relative versus 
absolute rates the effect of deprivation 
will be hard to assess. 
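
The arithmetic at issue can be made concrete. What follows is a minimal sketch and not a reanalysis of the Forgays and Levin data: the two sated baseline rates are hypothetical placeholders, and only the multipliers come from the passage above (1.4 for light, and 2.0 taken as the lower bound of "more than twice" for no light).

```python
# Illustrative arithmetic only. The sated baseline rates are hypothetical;
# the multipliers 1.4 and 2.0 are taken from the passage above.

def relative_increase(light_rate, no_light_rate):
    """Rate with contingent light as a multiple of the unreinforced rate."""
    return light_rate / no_light_rate

sated_no_light = 10.0   # hypothetical responses per session
sated_light = 30.0      # hypothetical responses per session

deprived_light = 1.4 * sated_light        # "about 1.4 times as much"
deprived_no_light = 2.0 * sated_no_light  # lower bound of "more than twice"

absolute_ratio = deprived_light / sated_light
sated_relative = relative_increase(sated_light, sated_no_light)
deprived_relative = relative_increase(deprived_light, deprived_no_light)

print(f"absolute: deprived/sated = {absolute_ratio:.2f}")
print(f"relative increase, sated = {sated_relative:.2f}")
print(f"relative increase, deprived = {deprived_relative:.2f}")
```

Whatever baselines are chosen, the deprived animals' relative increase is at most 1.4/2.0 = 0.7 of the sated animals' relative increase, even though their absolute rate is 1.4 times higher.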

Reinforcement schedule. There is 
some justification for the impression that 
LCBP is a capricious and insensitive 
dependent variable, unlikely to reflect 
treatment effects in satisfactory fashion. 
A procedural technique with promising 
analytical possibilities was introduced by 
Stewart and Hurwitz (1958), who 
shifted rats from a 1:1 LCBP reinforce- 
ment schedule to either a 3:1 or 6:1. 
Unlike rats working for ingestibles, the 
3:1 group responded more than the 6:1. 
Noting a similarity between this out- 
come and performance for known weak 
reinforcers, like dilute sugar solutions, 
Stewart (1960) explored both reinforce- 
ment schedule and light intensity. Four 
groups, bar pressing for four different 
illuminations, did not differ in LCBP 
during 17 days of 100% reinforcement. 
When shifted to partial schedules which 
grew progressively “leaner” (2:1 to 6:1) 
day by day, animals working for bright 
light increased their LCBP, while ani- 
mals bar pressing for dimmer lights did 
not. The results were ordered such that 
the brighter the light, the greater the 
LCBP as reinforcement schedule became 
leaner. 
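
The notion of a "leaner" schedule can be sketched as follows, assuming that a ratio such as 3:1 denotes three bar presses per light onset, as is conventional for fixed-ratio schedules; the passage does not spell this out. The function name and the figure of 60 presses are hypothetical.

```python
def reinforced_presses(n_presses, ratio):
    """Return 1-based indices of presses that produce light under a
    fixed-ratio schedule in which every `ratio`-th press is reinforced."""
    if ratio < 1:
        raise ValueError("ratio must be at least 1")
    return [i for i in range(1, n_presses + 1) if i % ratio == 0]

# Hypothetical session of 60 presses under the schedules named in the text.
for ratio in (1, 3, 6):
    lights = reinforced_presses(60, ratio)
    print(f"{ratio}:1 schedule, 60 presses -> {len(lights)} light onsets")
```

For the same number of presses a leaner schedule yields fewer light onsets, which is why sustained responding on 6:1 is read as a sign of a stronger reinforcer.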

The findings of Stewart and Hurwitz 
might be summarized by saying that al- 
though a rat will work for dim light, it 
will not work very hard (perform on a 
lean schedule) unless the light is suffi- 
ciently bright. The promise of this tech- 
nique is its ability to distinguish between 
illumination groups where the traditional
procedure of 100% reinforcement could
not. 

Effects across time. The “real reason” 
for reinforcement by light has concerned 
nearly every experimenter. Various
theories will be discussed later; however, 
it should be mentioned that the time 
course of LCBP is regarded by some as 
a clue to the mechanism of reinforce- 
ment. If LCBP declines markedly across 
days, the light can be regarded as having 
a “trivial” function; and perhaps LCBP 
resembles responses to novelty. If LCBP 
is maintained or increased, the function 
resembles that for acquisition curves for 
nutritive reinforcers. 

There are about 13 sets of published
data reporting LCBP across days. When
these are categorized as to increasing or 
decreasing LCBP, seven show a decrease, 
five show an increase, and one shows an 
upward-concave function. The temporal 
course of LCBP must depend upon cer- 
tain variables which differed between 
experiments. One candidate is adapta- 
tion time; adapted animals might be 
expected to have few competing re- 
sponses, hence bar press at a high initial 
rate as soon as the light was connected 
to the lever. Unadapted animals would 
be likely to explore first and bar press 
later, thus giving rise to an increasing 
function. This hypothesis was proposed 
by Hurwitz (1956) and was supported 
by the data of Appel and Hurwitz 
(1959). However, when the “increas- 
ing” and “decreasing” sets of data are 
examined, the hypothesis is of little help 
in resolving between-study discrepancies. 
In fact, adapted animals may tend to 
increase LCBP across days—an outcome 
contrary to the hypothesis. Other varia- 
bles must be operating. Since different 
studies use different lights, it is possible 
that some index of incentive was differ- 
ent in various experiments, and that 
response rate is better maintained across 
days for high incentives than for low.

Throughout the above discussion of
the effects of different variables, the 
general picture is that any given study 
may find “an effect”; yet the way the 
effect operates comes out differently in 
different studies. Quite possibly the 
effect operates one way at some level of 
some additional variable, and another 
way at some other level. Since two stud- 
ies in disagreement often differ in many 
ways—strain of animals, apparatus, light 
intensity, adaptation trials—it is im- 
possible to account for the disagreement. 
Without apparatus standardization and 
procedural conventions, we may have to 
be satisfied with knowing that a given 
variable sometimes has an effect without 
knowing what conditions make the effect 
increase or decrease. 


Theories of Light-Reinforced Bar 
Pressing 


Secondary reinforcement. Though 
never seriously proposed as an explana- 
tory device, secondary reinforcement via 
unintentional pairing of light and pri- 
mary reinforcement has been considered 
and rejected various times. Henderson 
(1953) argued against it, and Roberts 
(1954) made a definite attempt to asso-
ciate light with feeding for one group
before LCBP. Upon testing, the “asso- 
ciated” group showed the smallest first 
test day increment in LCBP. 

Using a large factorial design, Rob- 
erts, Marx, and Collier (1958) deliber- 
ately associated feeding with either light 
or dark, yet found no effect of the feed- 
ing variable upon bar pressing for either 
light-on or light-off. Kish (1955) has 
pointed out that the nocturnal habits of 
mice and rats would work against sec-
ondary reinforcement.

Successful establishment of secondary 
reinforcement was reported by Hurwitz 
(1960a, 1960b, 1960c) and by Hurwitz 
and Appel (1959). In both cases, the
association with food had but
a transient effect.

Facilitation effects. The response fa-
cilitation hypothesis states that stimula-
tion increases activity. Hence activity 
in an LCBP chamber would be greater 
than activity in dark chambers because 
a light sometimes comes on. The lever 
would intercept some of the activity, 
blink the light, and thus sustain more 
activity. Nash and Crowder (1960) 
argued for the facilitation position, but 
their data are based upon nonsignificant 
t tests between small groups. The posi- 
tion of Crowder, Wilkes, and Crowder 
(1960) is that responses in extinction 
are free from facilitation effects since 
the test light is gone, and that an ade- 
quate test for reinforcement should be 
based upon responses in extinction. 

Evidence against the facilitation hy- 
pothesis is convincing. In extinction, 
Barnes and Baron (1961a) found more 
responses for previously reinforced ani- 
mals than for nonreinforced. In acquisi- 
tion, the yoked control technique intro- 
duced by Kling, Horowitz, and Delhagen 
(1956) provides simultaneous control 
animals whose light is in parallel with 
that of the experimental subjects. Thus 
pattern of stimulation, duration, and 
illumination are the same; only the con- 
tingency differs. In this experiment, the 
controls maintained their low pretest 
operant after the light was connected to 
the experimental subject's lever, thus 
showing no facilitation effect. The ex- 
perimental subjects showed the typical 
sudden increase in response rate when 
the lever was connected. 
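
The logic of the yoked control can be sketched as follows. This is a simplified illustration, not the apparatus of Kling, Horowitz, and Delhagen: the press times are hypothetical, light duration is ignored, and the function names are invented for the example.

```python
def yoked_light_onsets(experimental_presses):
    """The two lights are wired in parallel, so each press by the
    experimental animal turns on the light in both chambers."""
    return sorted(experimental_presses)

def fraction_followed_by_light(own_presses, light_onsets):
    """Fraction of an animal's own presses that coincide with light onset."""
    if not own_presses:
        return 0.0
    onsets = set(light_onsets)
    return sum(t in onsets for t in own_presses) / len(own_presses)

# Hypothetical press times in seconds from the start of the session.
experimental = [12, 40, 41, 95, 130]
yoked = [5, 41, 77, 200]

onsets = yoked_light_onsets(experimental)
print("experimental:", fraction_followed_by_light(experimental, onsets))
print("yoked control:", fraction_followed_by_light(yoked, onsets))
```

Both animals receive the identical sequence of light onsets; only for the experimental animal does every press produce one, so a difference in rate between the two can be attributed to the contingency and not to stimulation as such.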

The discrimination experiment of For- 
gays and Levin (1959) further under- 
mines the facilitation position. Two 
levers were equidistant from a light out- 
side the Plexiglas chamber; some ani- 
mals received light for pressing one, 
some, the other. Operant rate on the 
functional lever showed not only a dis- 
crimination, but also a reversal after 
10 days of reinforcement. The facilita- 
tion hypothesis would predict an increase
in operant rate on both levers, or on one
nearer a light, but could not handle a 
discrimination. Were it altered to do so, 
it is difficult to see how a difference from 
operational reinforcement theory could 
be maintained. 

Stimulus-change hypothesis. This 
position emphasizes the contingency of 
a perceptual consequence upon an oper- 
ant. The consequence need not be edible, 
drive reducing, or within a particular 
modality. One of the clearest statements 
is provided by Forgays and Levin 
(1959): “when a response is followed 
by a distinctive stimulus change, the 
response level is at least maintained and 
often increased over trials. . . .” Since 
the stimulus-change position applies 
equally to increases and decreases of 
energy, it predicts that light offset would 
be as reinforcing as light onset. 

Light offset has not been found rein- 
forcing (Barnes & Kish, 1958; Hurwitz, 
1956; Robinson, 1957, 1959). The one 
exception to the rule is found in the data 
of Roberts et al. (1958) whose dark- 
reared albino rats may have bar pressed 
more than chance to turn off a 16-milli- 
lambert light. However, especially in 
view of later evidence about the effects 
of maintenance illumination, this part of 
the study might legitimately be classed 
with the light-aversion studies. 

A logical corollary of the stimulus- 
change position is that variation of stim- 
ulus consequences would sustain re- 
sponding above that for invariant conse- 
quences. This was first tested in one 
of Girdner’s (1953) pilot studies by 
comparing groups receiving light, weak 
buzzer, or light and buzzer in an alter- 
nating sequence. The alternated group 
responded more than the other two, but 
not significantly so. Morris, Crowder, 
and Crowder (1961) used a similar pro- 
cedure in which either brightness or 
position of lamps was alternated. The 
alternated groups responded no more 
than controls which received similar but
invariant light. Some evidence favorable
to the stimulus-variation corollary comes 
from Barnes and Baron (1961b), who 
showed that visual patterns provided by 
LCBP may evoke different response 
rates dependent upon pattern design. 

Thus the stimulus-change position is 
rendered almost untenable by the light 
offset studies and finds but weak support 
from the stimulus-variation studies. The 
reason for the dogged persistence of this 
position probably stems from the dubi- 
ous understanding gained from connect- 
ing LCBP studies to the exploration- 
novelty area, plus the rather conspicuous 
absence of adequate, alternative formu- 
lations. A recent position, the preference 
hypothesis, came out of the group of 
studies using the two-bar technique and 
may have explanatory and predictive 
value. This is discussed under Theories 
in the next section. 

Scanning theory. Hurwitz (1956) re-
ported that “When the light is kept on
for extended periods, the animal vigor-
ously surveys the box, making rapid,
scanning-like head movements.” Robin-
son (1961) gave theoretical stature to
this observation by hypothesizing that 
the presence of light “provides S with 
the opportunity to obtain further re- 
inforcement by scanning the visual 
inhomogeneities of the lighted test en- 
vironment.” Thus the reinforcing prop- 
erties of light offset are canceled by the 
loss of visual targets. This appears to be 
a two-factor theory preserving the stim- 
ulus-change position but explaining the 
unsupporting light offset data with a 
second factor. 

Robinson’s test of the hypothesis in- 
volved determining the pretest operant 
rate for hooded rats, then demonstrating 
an increase in response rate for both 
light increment and decrement. Both 
directions of change proved reinforcing, 
thus lending support to the scanning 
theory. 

The discrepancy hypothesis. Lubow
and Tighe (1957) summarized the posi-
tion as “level of motivation (as indicated
by the strength of the behavior
measured) is a function of the discrep-
ancy between present stimulation and
past stimulation.” Similar positions are
discussed by Meier, Foshee, Wittrig,
Peeler, and Huff (1960) and by Hunt
and Quay (1961), all of whom failed
to find support for the hypothesis. Ap- 
plied to the LCBP situation, the dis- 
crepancy hypothesis predicts an increase 
in operant rate as the delivered stimulus 
departs farther from maintenance il- 
lumination. Roberts et al. (1958) found 
just the opposite; the highest responders
were those that produced, by their re- 
sponse, their maintenance illumination. 
The findings of Robinson (1961) are 
also relevant and do not support the 
position. The discrepancy hypothesis 
has not been a prominent issue. 


DURATION OF SELF-EXPOSURE TO LIGHT


The dependent variable of nearly 
every study reported upon so far has 
been operant rate. Light, however, can- 
not be stored in the stomach; hence, 
metering it out in “pellet” form might 
be considered a strange procedure. Light 
occurs in nature and in the animal colony 
in ambient form, and one might conjec- 
ture that one road to discovering the 
mechanism of light reinforcement is in 
approaching it as an ambient condition 
rather than as a pellet. Hanson's (1951) 
tilt-floor technique did essentially this 
and provided some of the most regular 
and informative data of the literature. 
Had they been published, the course of 
research might have been changed. 

Using the single-lever LCBP standard 
technique, various investigators have re-
ported measurements of the duration of
responses. Girdner (1953) presented
graphs of both response rate and re-
sponse duration. Every effect found by
analysis of rate measures is also apparent
in duration measures; perhaps to a
greater degree in some cases. Reports of
response duration have also been made
by Hurwitz (1956), Premack and Collier
(1962), Roberts et al. (1958), and
Robinson (1961).

Fig. 1. Four dependent variables as a function of test luminance for groups of rats reared in
seven different lighting conditions. (Units on the ordinate refer to mean daily performance of
animals tested 24 hours per day. See Lockard, 1962a.)

In 1961, the present author began a 
series of studies whose procedure was 
intended to combine the single-lever, 
light-aversion, and light-reinforcement 
procedures (Lockard, 1961). Two bars 
were provided the animal in the test 
chamber. One bar turned on the test 
light, which then stayed on until turned 
off by the other lever. Rats were left 
in the test chamber for 12 days without 
disturbance; they were never removed 
to a colony between trials. Each day of 
testing provided four measures of be- 
havior: Ons, the number of times the 
bar turning on the test light was pressed; 
Offs, the number of times the off lever 
was pressed; Changes, or the number 
of times the light was changed from off 
to on that day; and Duration, the length 
of time the test light was left on. This 
procedure and the four dependent vari- 
ables it provides is described in some 
detail because it is unusual, and because 
most of the studies discussed below used 


it. There have been enough of these to 
warrant discussing how several different 
variables affect the dependent variables 
of the two-bar procedure. 
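
How the four measures follow from a day's record of lever presses can be sketched as below. The event format, the function name, and the assumption that the light is off when the day begins are all hypothetical; the definitions of Ons, Offs, Changes, and Duration follow the description above.

```python
def two_bar_measures(events, day_length_min=1440.0):
    """Compute Ons, Offs, Changes, and Duration for one test day.

    `events` is a time-ordered list of (minute, lever) pairs, where lever
    is "on" or "off". The light is assumed to be off at the start of the
    day; that starting state is an assumption of this sketch.
    """
    ons = offs = changes = 0
    duration = 0.0
    light_on = False
    on_since = 0.0
    for minute, lever in events:
        if lever == "on":
            ons += 1
            if not light_on:          # press actually changes off to on
                light_on = True
                on_since = minute
                changes += 1
        elif lever == "off":
            offs += 1
            if light_on:
                light_on = False
                duration += minute - on_since
        else:
            raise ValueError(f"unknown lever: {lever!r}")
    if light_on:                      # light still on when the day ends
        duration += day_length_min - on_since
    return {"Ons": ons, "Offs": offs, "Changes": changes, "Duration": duration}

# Hypothetical event log for one animal on one day.
log = [(10.0, "on"), (12.0, "on"), (25.0, "off"), (30.0, "off"), (1400.0, "on")]
print(two_bar_measures(log))
```

The sketch makes one point of the definitions explicit: a press on the on lever while the light is already on adds to Ons but not to Changes, so the two measures need not agree.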


Variables Affecting Performance 


Light intensity. Rather substantial 
data are available, for test luminance has 
been a favorite independent variable. 
Throughout the series, Duration of self- 
exposure has decreased as test lumi- 
nance * increased. Any of the lower cells 
of Figure 1 illustrate this effect. These 
data differ from Hanson's (1951) tilt- 
floor data in that the albino rats tested 
in dim light typically spend more than 
half the testing time with the light on. 
This finding has led to the hypothesis 
that as test illumination is increased 
from threshold “brightwards,” the corre- 
sponding motivational properties pass 
through a positive region before dipping 
down into the negative region of light 
aversion. This is called the prefer-
ence hypothesis and is discussed under
Theories.

8 The luminance values of .01, .1, etc., refer

The function relating the other three 
variables (Ons, Offs, and Changes) to 
test luminance was sometimes orderly 
within one experiment, but was differ- 
ently shaped for each experiment. This 
embarrassing resistance to replication 
was finally traced to differences between 
the animals and is discussed next. 

Maintenance history. Certain ship- 
ments of animals seemed to have a “tem- 
porary preference” for bright light, for 
they would leave the bright overhead 
light of the test chamber on about 22 
hours per day for the first few test days, 
then abruptly shift to low Durations of 
less than 1 hour. A visit to Sprague- 
Dawley revealed that the colony was 
illuminated 24 hours per day by large
overhead fluorescent fixtures, and that
some cages (the upper ones) were much 
more illuminated than others. Since all 
animals in a shipment are likely to come 
from the same cage, animals in the dif- 
ferent experiments could differ appreci- 
ably in their lighting history. 

This idea was tested experimentally
(Lockard, 1962b) by maintaining two
groups of rats in differently illuminated 
environments for 12 days, then testing 
both groups at each of five test lumi- 
nances by the two-bar procedure. The 
two differently maintained groups dif- 
fered appreciably in both Duration and 
Changes. This demonstrated an effect,
but the job of relating main-
tenance history as a continuous variable
to test performance was left to a sub-
sequent experiment (Lockard, 1962a).
Male albino rats were reared from wean- 
ing in one of the seven conditions shown 
across the top of Figure 1, then tested 
by the two-bar procedure for 12 days 
in chambers whose light provided one of 
five luminances (.01-100 millilamberts).
Twenty rats were reared in each of the 
seven conditions; these were tested, four
at .01 millilambert, four at .1 millilam-
bert, etc., for a total of 140 animals.
Figure 1 shows the four dependent vari- 
ables of this study, each as a function of 
test luminance within each of the seven 
rearing conditions. 
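
The layout of this design can be checked by enumeration. In the sketch below the rearing-condition labels are hypothetical placeholders, and the five test luminances are assumed to fall in decade steps from .01 to 100 millilamberts, which is consistent with the values mentioned in the text but is not stated outright.

```python
from itertools import product

# Hypothetical labels standing in for the seven rearing conditions.
rearing_conditions = [f"rearing_{i}" for i in range(1, 8)]

# Assumed decade steps across the stated range of .01 to 100 millilamberts.
test_luminances_ml = [0.01, 0.1, 1.0, 10.0, 100.0]

rats_per_cell = 4  # four rats at each test luminance within a condition

cells = list(product(rearing_conditions, test_luminances_ml))
rats_per_condition = len(test_luminances_ml) * rats_per_cell
total_rats = len(cells) * rats_per_cell

print(len(cells), "cells;", rats_per_condition, "rats per rearing condition;",
      total_rats, "rats in all")
```

Seven rearing conditions crossed with five test luminances give 35 cells of four animals each, which agrees with the 20 rats per condition and 140 animals reported.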

The real point of Figure 1 is that 
animals with different lighting histories 
perform differently in the same test sit- 
uation. For example, rats reared in the 
light-dark cycle show Offs decreasing 
with test luminance, while rats reared 
in bright light (100 millilamberts) show 
just the opposite. Although Changes are 
plotted on an unflattering scale, it can 
be seen that they maximize at 1.0 milli- 
lambert for dark-reared rats, yet increase 
with test luminance for rats in the 100- 
millilambert condition. In the lower cells 
it can be seen that a test luminance of 
1.0 millilambert is kept off most of the 
time by the animals reared in dim light, 
yet is kept on more than 12 hours (720 
minutes) by rats reared in the brighter 
conditions. Thus whether or not a given 
light is aversive depends upon the light- 
ing of the environment from which the 
animal came. 
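
The 12-hour (720-minute) dividing line used above amounts to asking whether the light was left on for more than half of the 24-hour test day. A minimal sketch of that criterion follows; the Duration values and group labels are hypothetical and are not read from Figure 1.

```python
MINUTES_PER_DAY = 24 * 60  # animals were tested 24 hours per day

def proportion_on(duration_min):
    """Fraction of the test day during which the light was left on."""
    if not 0 <= duration_min <= MINUTES_PER_DAY:
        raise ValueError("duration must lie within one day")
    return duration_min / MINUTES_PER_DAY

def kept_on_most_of_day(duration_min):
    """True when the light was on for more than half the day (over 720 min)."""
    return proportion_on(duration_min) > 0.5

# Hypothetical mean daily Durations (minutes) at a single test luminance.
for label, duration in [("dim-reared", 200.0), ("bright-reared", 900.0)]:
    print(label, round(proportion_on(duration), 2), kept_on_most_of_day(duration))
```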

Strain differences. Although most of 
the research discussed in connection with 
the single-lever, light-aversion, and 
light-reinforcement procedures used al- 
bino rats, pigmented rats are sometimes 
used (Hurwitz, 1956; Robinson, 1959, 
1961; Stewart, 1960). Pigmented-albino 
strain differences have been noted in 
related situations (Carr, 1957; Lashley, 
1930; Smith, 1960); therefore Lockard
(1962c) compared Long-Evans hooded 
rats with Sprague-Dawley albinos with 
the two-bar procedure. Duration is shown 
in Figure 2 as a function of test lumi- 
nance for each of the two strains. The 
hooded rats show no evidence of light 
aversion, for bright test lights were kept 
on about as long as dim ones. The al- 
binos showed the typical decrease in 
Duration as test luminance increased. 

Fig. 2. Preference functions for hooded and
albino rats kept in darkness or light for at
least 12 days prior to testing in the two-bar
procedure (Lockard, 1962c).

A later experiment (Lockard, unpub-
lished) implicated the eye pigmentation
as the major difference between strains.
Pigmented male rats that were heterozy- 
gous for the gene allowing expression of 
color were crossed with Sprague-Dawley 
albino females. The resulting litters were 
about half pigmented, half albino. Pairs 
of brothers were then reared together, 
a pair consisting of one albino and one 
pigmented rat from the same litter. At 
maturity, pairs were tested in adjacent 
compartments by the two-bar procedure. 
The pigmented rats were much less light 
aversive than their pink-eyed albino 
brothers. 
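
The expectation of litters that are about half pigmented and half albino follows from a single-gene cross, which can be enumerated as below. The allele symbols "C" and "c" and the function names are hypothetical; the sketch assumes one locus with pigment expression dominant, as the description of the heterozygous males implies.

```python
from collections import Counter
from itertools import product

def offspring_ratio(parent_a, parent_b):
    """Enumerate the equally likely allele pairings from two parents."""
    counts = Counter()
    for a, b in product(parent_a, parent_b):
        genotype = "".join(sorted(a + b))  # "Cc" and "cC" are the same
        counts[genotype] += 1
    total = sum(counts.values())
    return {g: n / total for g, n in counts.items()}

def phenotype(genotype):
    """Pigment is expressed if at least one dominant allele is present."""
    return "pigmented" if "C" in genotype else "albino"

ratios = offspring_ratio("Cc", "cc")  # heterozygous male x albino female
by_phenotype = Counter()
for genotype, p in ratios.items():
    by_phenotype[phenotype(genotype)] += p
print(dict(by_phenotype))
```

Because littermates of both kinds share parents and rearing, the comparison isolates pigmentation from the other differences between the two strains.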

Effects across time. The temporal 
course of the Ons, Offs, Changes, and 
Duration measures from the two-bar 
procedure is orderly, but depends upon 
test conditions. Without any pretraining
on a “dummy” lever put in the rearing 
cage before testing, about 20% of the 
animals put into test chambers fail to 
bar press for several days. Then a great 
increase in operant activity occurs which 
slowly decays across days. Animals pre- 
trained on a dummy lever (which only
clicks) begin bar pressing at a high rate
the first test day, then decay to an as- 
ymptote that remains about the same for 
weeks. Thus the temporal course of op- 
erant activity could appear to increase, 
decrease, or assume a concave shape 
depending upon the strength and persist- 
ence of variables acting to inhibit oper- 
ant activity. 

The animals whose data are shown in 
Figure 1 were pretrained on a dummy 
lever and hence showed high first-day op-
erant activity which decayed to an as-
ymptote within the first six testing days. 
All the data shown in Figure 1 are based 
upon only the last six experimental days; 
performance in the act of stabilizing was 
neglected in favor of stabilized perform- 
ance. Digressing for a moment to a for- 
mer topic, it is interesting to note that 
the effects of rearing condition apparent 
in Figure 1 are effects measured 6-12 
days after removal from the rearing en- 
vironment, not right afterwards. 


Theory of Self-Regulated Exposure to 
Light 


Preference theory. Solutions of sugar, 
saccharine, and salts are ingested at a 
greater-than-water rate that depends 
upon the concentration of the solute. 
When “amount consumed” is plotted as 
a function of concentration, the resulting 
curve is commonly called a preference 
function. By analogy to this convention, 
Duration of self-exposure to light plotted 
as a function of intensity is also called a 
preference function. This device serves 
to eliminate the categorical boundary 
between the light-aversion studies and 
the light-reinforcement studies, and to 
suggest the direction of a quantitative 
theory uniting two formerly discrete 
research areas. It also offers an explana- 
tion of the reinforcing properties of dim 
light; darkness has a lower incentive 
value than dim light. The reinforcement 
stems from the fact that a response
creates a situation higher on the prefer-
ence hierarchy than does nonresponding.

Thus a considerable portion of the
LCBP studies can be reinterpreted as
preference tests between some illumina- 
tion and darkness. They therefore prove 
nothing about stimulus change because 
the change has always been confounded 
with the production of a dim ambient 
illumination of high preference value. 
The adequate test of the stimulus-change 
position would require that illumination 
both before and after the change be the 
same. Perhaps some sort of pattern 
change would accomplish this. 

The preference theory is complicated 
by the already discussed fact that dif- 
ferent light histories determine subse- 
quent preference behavior. Thus there 
is no single preference function for light, 
but rather a family of functions with 
some index of light history as the pa- 
rameter. A second difficulty is pro- 
cedural, in that the functions shown were 
determined by long-term procedures. 
Tests of reinforcing effect, based upon 
short-term procedures, would not neces- 
sarily result in an identical function. 
This is an empirical problem of concern 
beyond effects of light, for the basic 
issue is the relationship between prefer-
ence value and efficacy of reinforcement.

Adaptation effects. While not consti- 
tuting a theory so much as indicating 
evidence, a progressive shift in adapta- 
tion level (Helson, 1959) whose tem- 
poral course is unknown can be inferred. 
The severe and persistent effects of main- 
tenance illumination are one extreme; 
but each short exposure to light, or the 
time spent in the animal colony between 
trials, could shift the adaptation level 
and affect the incentive properties of the 
light. This item is important to meth- 
odology and needs more data before it 
can be properly discussed. 

Homeostatic behavior. Animals in two- 
bar test chambers with very bright lights 
(100 millilamberts) characteristically ex-
pose themselves to roughly 10 minutes
of light per day, yet produce about as 
many Changes (periods of light) as ani- 
mals with dim lights. Why the rat should 
turn a bright light on at all is puzzling, 
since the onset should be aversive. Per- 
haps the light is “needed for something” 
in a physiological sense. In fact, the 
constancy of the Duration from day to 
day resembles the constancy found in 
known homeostatic systems like the reg- 
ulation of daily caloric intake. Later 
sections will discuss the effects of light 
on body weight and other physiological 
functions. At present, only a vague con- 
jecture in this direction can be made. 


EFFECTS OTHER THAN OPERANT RATE 
OR DURATION OF SELF-EXPOSURE
DEPENDENT UPON LIGHT 


The research and theories discussed 
so far seem characterized by the assump- 
tion that the effects of light are essen- 
tially nonphysiological in nature. It 
seems to be regarded as something to 
see by unless it is too bright, in 
which case electrophysiological tech- 
niques would probably discover ac- 
tivity in pain fibers. Or light may be 
regarded as a mere change, a sort of 
trivial fluctuation of environmental stim- 
uli which can be classed with informa- 
tional sensory events more than with 
energy. The remainder of this paper 
makes the point that light affects the 
activity of rodents, acts to synchro- 
nize behavioral periodicities, and can 
markedly affect endogenous rhythms in 
even dim, brief doses. Finally, evidence 
is presented that light and light cycles 
have appreciable physiological effects. 


General Activity and Running Speed 


Running speed appears to be related 
to illumination, but the task of sorting 
out the effects of absolute illumination 
from transitions remains. Henderson 
(1953, 1957) found that runway running 
speed increased to a maximum in very
dim light, then decreased as runway
illumination increased. But Weyant 
(1959) varied both adaptation illumina- 
tion and runway illumination, and found 
no effect of absolute illumination; in- 
stead, a change from adaptation illumi- 
nation to a brighter runway significantly 
affected running. Weyant, however, used 
hooded rats while Henderson used al- 
binos; thus neither study is a replication 
of the other. A strain difference appears 
quite likely, for Smith (1960) found that 
albinos pulled harder the nearer they 
approached a lighted goal but hooded 
rats pulled less as the goal was ap- 
proached. When the light was signifi- 
cantly reduced, the two strains pulled 
alike and neither pulled increasingly 
harder as the goal was neared. 

The runway data of Weyant are re- 
lated to the activity measures of Lubow 
and Tighe (1957). Albino rats were 
adapted at some illumination, then 
tested for locomotor activity at less, the 
same, or a greater illumination. Activity 
was not affected by adaptation-to-test 
illumination decreases, but was increased 
by illumination increases. The authors 
concluded that change as a variable is 
effective on an intensity continuum only 
when it is operating in an increasing di- 
rection. This is an interesting parallel to 
the operant data already discussed show- 
ing that light offset fails to motivate bar 
pressing. 

The activity on a dummy lever put 
into the rearing cage of albino rats was 
measured by Lockard (1962a) over the 
12 days of rearing preceding testing. As 
the rearing illumination for different 
groups increased, bar pressing (for only 
a click) decreased. Rats being reared 
in a light-dark cycle, whose light phase 
was 100 millilamberts, resembled the 
dark-reared group and not the 100-milli- 
lambert group. These results clearly 
show an effect of absolute illumination, 
though no understanding of transitional 
effects is gained.

Transient relative increases of illumi-
nation appear to have some definite ef-
fect, along with absolute illumination. 
This area awaits large-scale studies with 
careful methodology to untangle relative 
from absolute effects for both albino and 
pigmented strains. 


Synchronization of Behavioral Periodic- 
ity with Light-Dark Cycles 


Much of the literature relating activ- 
ity to cyclic lighting is summarized by 
Munn (1950). There is little doubt that 
maximum activity of rats occurs during 
the dark phase, although there may be a 
cyclical component with a period less 
than 24 hours (Szymanski, 1918). Fur- 
thermore, rodents active during the day, 
such as chipmunks, would probably show 
just an opposite activity rhythm from 
nocturnal rodents. 

The cyclic activity pattern of rats 
seems to have both an internal and an 
externally sensitive component. Both 
Hunt and Schlosberg (1939) and Brow- 
man (1937) reversed the light-dark cycle 
and found a persistence of the rhythm 
together with a gradual reversal, such 
that after 5 days to 2 weeks the activity 
maximum again occurred during the dark 
phase. Blinded rats tended to keep up a 
diurnal activity rhythm, but drifted out 
of phase with the lighting cycle and out 
of phase with one another. Their “night” 
and “day” activity also showed less dif- 
ference than that for sighted rats on a 
light-dark cycle. The predominant syn- 
chronizing effect of external stimulus fac- 
tors was shown by Browman (1952), 
who put rats on an artificial 16-hour day. 
Within about 5 solar days the former 24- 
hour rhythm was disrupted and replaced 
by a 16-hour rhythm. Activity rhythms 
in the wood mouse and a species of vole 
have been modified by Miller (1955), 
but Vaughan and Hansen (1961) found 
no synchronization for gophers. These 
animals tended to have a series of short 
active periods evenly distributed over 
each 24 hours. Several thousand field
trappings further indicated that the ani- 
mals might be active at any time of day. 
Hansen (1957) found somewhat the 
same nonsynchronous characteristic for 
lemmings. 

It seems clear that while activity can 
synchronize with light-dark cycles and 
certainly does so for laboratory rats, 
other species may not at all fit a rat 
model. Applications to laboratory meth- 
odology may come from future research; 
for example, darkness could possibly be 
aversive during the period normally oc- 
cupied by the light phase of a lighting 
cycle, but not during the dark phase. 
In addition, the diurnal activity varia- 
tion has implications for any experimen- 
tal program applying treatments to rats 
at different times of the day. 


Endogenous Rhythms and their Modification
by Light

When certain rodents are kept in the 
dark for long periods of time, a daily 
periodicity of activity occurs which seems 
to persist indefinitely. The onset of ac- 
tivity, particularly running in a wheel, 
is often more clearly evident than the 
end of the activity period. Johnson 
(cited by Laurens, 1933) found alternate 
periods of activity and inactivity of 
about 12 hours each for wild mice. Ac- 
tivity periods do not coincide with either 
solar day or night, but drift slowly round 
the clock because the interval between 
onset of activity periods is rarely exactly 
24 hours. Although this interval, or wave
length of the cycle, is probably influenced 
by several environmental factors, light 
has been definitely shown to affect the 
timing mechanism. One common proce- 
dure is to demonstrate that the wave 
length of the activity cycle changes when 
constant darkness is replaced by con- 
stant light (Hemmingsen & Krarup, 
1937; Pittendrigh & Bruce, 1959; Raw- 
son, 1959). Johnson (1939) manipu- 
lated illumination as a variable and 
found that the brighter the constant
light, the greater was the delay of activ- 
ity every day. 

Short exposures to dim light were 
shown by De Coursey (1960) to both 
advance and delay the onset of running 
of flying squirrels. The animals were 
maintained in light-tight cabinets at a 
uniform temperature, and the clocklike 
wave length of the activity rhythm de- 
termined for each animal in constant 
darkness. Then light shocks (10 minutes 
of .5 foot-candle) were introduced, using 
time before or after the onset of running 
as a variable. The maximum delay in the
onset of activity occurred when the light 
shock coincided with the onset of run- 
ning. Later light shocks had less and less 
effect until the shock was applied about 
7 hours after the onset of running. In 
that case, the next period was advanced. 
This work provided a significant advance 
in that De Coursey's parametric ap- 
proach disclosed the quite orderly rela- 
tionship between time of light shock and 
its effect upon the "biological clock" of 
the flying squirrel. Her work is also important
in showing how profound and
persistent an effect is produced by a
remarkably small amount of energy. 

In De Coursey's procedure, the experi- 
menter produces the light; in bar press- 
ing studies, the subject produces the 
light. It would not be unreasonable to 
expect self-produced periods of light and 
dark to bear an orderly relationship to 
the circadian locomotor periodicity
characteristic of many rodents. Were 
this so, the motivation for bar pressing 
for light would be associated with the 
maintenance of a rhythmic system; and 
the light would begin to resemble a “primary
need” more than a mere sensory change.


SOME PHYSIOLOGICAL CHANGES
DEPENDENT UPON LIGHT 


The purpose of this section is to sug- 
gest the possibility that the motivational 
structure underlying light-controlled behavior
may involve physiological mechanisms
which are optimally maintained
by light. Some “optimal quantity” of
light may be necessary if body weight 
and health are to be normal; thus the 
work performed for light would be analo- 
gous to the role played by the striped 
muscle system in maintaining other phys- 
iological systems in equilibrium, as eat- 
ing behavior maintains caloric intake and 
hence keeps numerous metabolic param- 
eters within tolerable limits. 

No pretense of more than sampling the 
considerable literature is made. En- 
trances into the literature can be made 
through general references (Bissonnette, 
1936; Blum, 1941; Hammond, 1954; 
Hendricks, 1956; Laurens, 1933; Mc- 
Elroy & Glass, 1961; Mast, 1911; 
Rowan, 1938). 

What might be called extreme treat- 
ments involving sunlight and arc lamps 
have pronounced effects (Laurens, 1933) 
upon the skin and its pigmentation, 
wounds and their healing rate, blood 
pressure and pulse rate, and upon metab- 
olism. Of more interest within the pres- 
ent context are the physiological effects 
of much less extreme treatments, such 
as shifting animals to a dark environment,
or returning them to laboratory
lighting after a stay in the dark. After 
discussing many such experiments, Lau- 
rens summarizes as follows: 


when animals are first removed to an environ- 
ment of constant and practically absolute dark- 
ness there are temporary changes in the con- 
centration and proportion of certain body 
constituents with a gradual return to normal 
[p. 256]. Probably, any deviation from the 
usual, insofar as radiant energy is concerned, 
acts as a stimulus which disturbs the metabo- 
lism of these constituents [p. 259]. In con- 
clusion it may be said that the results of this 
series of experiments, including those on nutri- 
tion and growth and on organic constitution, 
show that, under appropriate conditions, it is 
highly probable that radiant energy, of any 
kind, is capable of producing a biological ef- 
fect even though the energy involved may be 
exceedingly small [p. 264]. 


Some Effects of Maintenance Illumination


Laurens (1933) discussed an experi- 
ment by Luce-Clausen (cited by Laurens, 
1933), who demonstrated a growth-pro- 
moting effect of red light on rachitic rats. 
She found that 10-minute daily exposure 
to a band of radiation from 720 to 1,120 
millimicrons stimulated growth and pro- 
longed the survival period of rats on a 
rickets-producing diet. Rickets stems 
from avitaminosis D; and vitamin D, 
or a substance much like it, is synthe- 
sized in the skin by ultraviolet radiation 
between 297 and 310 millimicrons. Lau- 
rens also discussed an experiment by 
Ludwig and von Ries (cited by Laurens, 
1933), who 


also reported that the size and weight of rats 
grown under visible red rays were better by 
80 grams than in the controls. The size and 
weight of the young exposed to blue light 
were also greater than those of controls but 
never reached those of young exposed to red
light.


Watanabe (1937) reported that red light 
regulates the irregular sexual cycle of 
female rats, while blue light stops or 
upsets a sexual cycle that is regular to 
begin with. Also, a report from Algeria 
(1952) stated that the milk secretion of 
female rats was greater for animals kept 
in red light than for those kept in green 
light or in daylight. But Munn (1950) 
points out that such findings “would be 
of little significance, however, unless the 
colors were equated in brightness value 
for the rat’s eye.” 

Turning from colored light to white 
light, Browman (1937) reported that a 
light-dark cycle produced normal estrous 
cycles of female rats, while continuous 
light produced marked estrous disturb- 
ances in about half of the animals. Thus 
cyclic lighting appears to regulate not 
only activity rhythms, but cyclic hor- 
monal activity. 

Rearing illumination seems positively 
related to body weight and rate of sexual
maturation. Fiske (1941) raised rats of 
both sexes in either light or dark. Light-reared
rats showed a higher growth rate
of pituitaries, ovaries, uteri, testes, and 
seminal vesicles. Females reared in 
light came into sexual maturity 6 days 
earlier than those kept under normal 
laboratory conditions and 16 days earlier 
than females reared in darkness. A simi- 
lar finding was discussed by Hendricks 
(1956). When rearing illumination was 
treated as a variable in six steps from 
darkness to 116 foot-candles, Lockard 
(1962a) found that body weight of male
albino rats increased with increased il- 
lumination. Rats reared in a light-dark 
cycle, whose light phase was 116 foot- 
candles, were as heavy as rats reared in 
constant 116 foot-candles. 

It is not clear how rearing illumination,
especially darkness, affects development
of the visual system. The actual
structure of the retina appears to develop 
normally in dark-reared rats (Detweiler, 
1943), but Liberman (1962) reported 
evidence of lower acetylcholinesterase 
retinal content for dark-reared rats. 
Liberman does not indicate whether this 
appeared to be a chronic condition, or 
whether a few hours in the light would 
make dark-reared retinas indistinguish- 
able from light-reared. 

From a methodological standpoint of 
especial interest to the psychologist, it 
appears improbable that experimental 
animals reared in different illuminations 
are, upon maturity, different only in 
“visual deprivation.” Appreciable physiological
differences are likely, and extreme
differences are possible with stock
diets whose vitamin D content could produce
healthy light-reared rats and rachitic
dark-reared ones. One avenue of
escape from the dilemma of rearing effects
is suggested by the eye-occluder
technique developed by Mishkin, Gunkel,
and Rosvold (1959) and used on
rats by Siegel (1960). Opaque plastic
hemispherical contact lenses are pressed 
over the animal's eyes and stay in place 
indefinitely. Thus different visual groups 
could be reared in the same illumination, 
even together in the same cage. 


Seasonal or Long-Term Effects 


Certain seasonal behavior, such as mi- 
gration and reproduction, is controlled 
by progressive changes of the light:dark 
ratio within the diurnal cycle. For cer- 
tain submammalian species, the relation- 
ship is well understood. Within rodents, 
however, the quantitative details are not 
yet worked out and the picture is com- 
plicated by the different mechanisms 
operating in different species. Wells 
(1959), for example, established that the 
sexual development of the ground squir- 
rel was independent of day-length 
changes but was temperature sensitive. 
But in the field mouse, gonad size and 
reproductive behavior decreased with 
gradually decreasing day length (Baker 
& Ranson, 1932) and increased with in- 
creasing day length (Whitaker, 1936). 
In other words, sexual development and 
reproduction can be experimentally ma- 
nipulated such that mice will come into 
estrus in winter, when they are normally 
in anestrus, and vice versa. Braden 
(1957) showed that different strains of 
mice respond differently to the same 
treatment, and Miller (1955) showed 
that voles respond differently than mice. 
Thus while long-term changes in the 
light:dark ratio have undeniable ef- 
fects, the direction and magnitude for 
each species may have to be worked out 
separately. 


Circadian Effects 


Physiological effects attributable to 
lighting within a 24-hour period are 
distinguished from seasonal effects by 
demonstrating that certain dependent 
variables assume different values in cyclic
fashion during the light-dark period.
Halberg and his associates (Halberg, 
Barnum, Silber, & Bittner, 1958; Hal- 
berg, Halberg, Barnum, & Bittner, 
1959; Halberg & Visscher, 1953; Hal- 
berg, Visscher, & Bittner, 1954) at Min- 
nesota have published a series of papers 
dealing extensively with circadian peri- 
odic variation of physiological function. 
Only a brief summary which does an in- 
justice to the quantity and quality of 
the work is possible here. The reader 
may enter the literature via the refer- 
ences (e.g., Halberg & Barnum, 1961). 

The standard procedure is to maintain 
mice on a rigid 12-hour light, 12-hour 
dark cycle for at least a week; then mice 
are killed at intervals throughout a 24- 
hour period and the tissues fixed or other 
assay techniques performed. The dependent
variables are measures of eosinophil
density; glucose and “corticosterone”
blood content; mitoses in the
skin of the ear and in liver tissue; and 
liver metabolism of ribonucleic acid and 
DNA. When each of these dependent 
variables is plotted with respect to the 
time of day the sample was taken, the 
result in each case is a rhythmic daily 
function. The various physiological
systems sampled do not necessarily 
reach maxima at the same time of day. 
When the light-dark cycle is reversed, 
the rhythms correspondingly shift, show- 
ing the lighting to be the synchronizer. 
The rate of resynchronization is not the 
same for the various measures. For ex- 
ample, the mitotic activity of the liver 
resynchronizes to a reversed light cycle 
at a faster rate than the ear mitotic 
cycle. 

A somewhat crude summary of the 
physiological effects of light is that ani- 
mals have to be kept in one or another 
lighting condition; and whatever the 
choice may be, it constitutes an effective 
treatment whose effects are usually ig- 
nored. Changes in the lighting routine 
and the transport of animals from one
illumination to another constitute fur- 
ther treatment whose effect could become 
confounded with that of the experimen- 
tal situation to which the animals were 
transported. Even when all factors were 
neatly counterbalanced within a given
experiment, the levels of potent static 
variables could differ enough between 
laboratories to make the same study 
come out two different ways. For ex- 
ample, at Wisconsin the animal colony 
lights are kept on 24 hours a day to con- 
form with the conditions at Sprague- 
Dawley. Elsewhere, rats usually live on 
a light-dark cycle. Any experiment 
whose outcome was likely to be influ- 
enced by activity rhythms, reproductive 
condition, or cyclic physiological factors 
might well provide one set of results for 
constant-light rats, another for light-dark
cycle rats.


COMMENTS ON METHODOLOGY


Several historically separate categories 
of research have been discussed in terms 
of how identified variables operate within 
a particular procedure. These categories 
are preserved in the current literature 
because of their separate origins, the 
separate theoretical issues associated 
with each, and their methods. Historical 
origin is insufficient justification for the 
categorical boundaries. Theory within 
each category lacks convincing sub- 
stance, and might well profit from hy- 
bridization across boundaries. Thus 
method differences mainly define the 
categories; and it might be asked to 
what extent these methods actually 
differ. 

Method differences among categories 
are based upon three types of variables— 
independent, dependent, and procedural. 
Independent variables are quite similar 
across methods and thus do not distin- 
guish them from one another. Depend- 
ent variables differ according to what 
responses are selected for measurement
and therefore in what responses go un- 
measured. The criterion of how many 
treatment effects go unmeasured may 
affect the experimenter, but not the 
subject, and hence does not distinguish 
between experimental methods. Thus
the procedural variables alone act to 
maintain research in categories; yet 
these variables often differ in but a 
binary way. For example, one binary 
difference is whether a response turns a 
test light on or off. Another is whether 
the light stays on only with, or without, 
continuous responding. A third differ- 
ence is whether the experimenter or the 
subject controls the light, and a fourth 
is whether test sessions are long or short. 
Thus it could be profitable to conceive of 
each established procedure as a special 
case of a generalized procedure with the 
above binary or continuous dimensions. 
For example, a typical LCBP procedure 
results from selecting a particular com- 
bination of the binary procedural dimen- 
sions: the response turns the test light 
on, the light stays on only with continu- 
ous responding, the subject controls the 
light, and the test sessions are short. 
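
As a minimal sketch, the generalized procedure can be written as a record with one field per binary dimension. The Python below is illustrative only: the class name, field names, and helper function are hypothetical labels chosen here, not terms from the literature, and session length is treated as binary although it could equally be continuous.

```python
from dataclasses import dataclass, fields
from itertools import product


@dataclass(frozen=True)
class Procedure:
    """One point in the space of light-contingent procedures (hypothetical names)."""
    response_turns_light_on: bool      # False: the response turns the light off
    needs_continuous_responding: bool  # False: light stays on without responding
    subject_controls_light: bool       # False: the experimenter controls the light
    short_sessions: bool               # False: long test sessions


# The typical LCBP procedure as characterized in the text.
TYPICAL_LCBP = Procedure(True, True, True, True)

# Every combination of the four binary dimensions.
ALL_PROCEDURES = [Procedure(*combo) for combo in product([True, False], repeat=4)]


def differing_dimensions(a: Procedure, b: Procedure) -> list:
    """Names of the dimensions on which two procedures differ."""
    return [f.name for f in fields(a) if getattr(a, f.name) != getattr(b, f.name)]
```

Listing the dimensions on which two established procedures differ makes explicit how small the procedural gap between research categories can be.
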
The issue here is that research cate- 
gories would seem best based upon the 
extent to which they dealt with different 
phenomena. If the same basic behavioral 
phenomenon takes on different quantita- 
tive values as, say, illumination is varied, 
the tradition of parsimonious formula- 
tions is better served by one research 
category than by two. But to the extent 
that different procedures untangle differ- 
ent phenomena, the resulting research 
categories would be testimony to a 
greater understanding. At present, it is 
not clear whether light-controlled be- 
havior is a collection of separate phe- 
nomena appropriately pursued by ex- 
isting separate procedures, or whether 
present categories of research are in- 
herited liabilities. 


ROBERT B. LOCKARD 
THE COLOR PYRAMID TEST:


A NONVERBAL TECHNIQUE FOR PERSONALITY ASSESSMENT 1


K. WARNER SCHAIE 
A description of the Color Pyramid Test (Farbpyramidentest) and its
interpretive rationale which has received much attention in the German 
psychological literature but is almost unknown in this country. Review 
of the research literature suggests that: the CPT is about as reliable as 
most of the personality inventories in current use; it is easy to administer 
and is applicable over a wide range of ages, educational and cultural 
backgrounds; it is useful for the gross differentiation of groups with 
deviant personality characteristics and for obtaining information on the 
control of affect and other behavior traits in individual Ss. Although 
some of the reported studies lack adequate experimental controls and 
statistical treatment, the CPT promises to be a useful tool for research
and clinical personality assessment.


One of the major technical problems 
limiting the range of applicability of 
most of our major diagnostic instru- 
ments for personality study is their re- 
quirement that the subject must react 
to a standard set of stimuli by means of 
a verbal response. This requirement 
restricts the range of responses availa- 
ble from persons with limited vocabu- 
laries (e.g., children, bilingual subjects, 
etc.) and impairs the usefulness of
many techniques with subjects who are 
unable or unwilling to communicate by 
verbal means. Another limitation of 
personality tests requiring verbal re- 
sponse is the difficulty of constructing 
such tests so as to be truly objective in 
the sense that they do not involve self-appraisal
(Cattell, 1959). Even in the
projective techniques the subject gen- 
erally has the benefit of clues inherent 
in the stimulus structure and may or- 
ganize his verbal response to conceal 
some of the very personality attributes 
which are to be assessed.

1 Preparation of this paper was facilitated
by financial support granted by the University
of Nebraska Research Council, which is gratefully
acknowledged.

Our criteria for an objective nonverbal
personality assessment technique
therefore require a procedure which does
not involve introspective self-evaluation, 
where the subject does not know (and 
where the stimulus structure provides 
no clues) what aspects of his perform- 
ance will be interpreted, and where no 
verbal response is required. Recent dis- 
cussions on the unimportance of test 
item content (Berg, 1955, 1959) sug- 
gest that many more operations than 
have hitherto been utilized might be 
found to meet these requirements. In 
fact any set of stimuli which can elicit 
a wide variety of responses will be a 
suitable vehicle provided that (a) the 
subject cannot infer the relation be- 
tween his response to the stimulus and 
the attribute to be predicted therefrom; 
(b) the subject's response to the stimulus
does not occur randomly, but in a
consistent manner showing stable associations
to the behavior to be predicted;
and (c) that the range of individual 
differences be sufficiently large above 
and beyond such consistent associations 
to permit clinical application. 

One of the most promising stimulus 
variables likely to meet the above criteria
is the use of response to color or
color preference. The literature on color 
preference and its cultural and biological
concomitants has been reviewed by
Pressey (1921) and by Norman and 
Scott (1952). Most of the older studies 
utilize terms which lack operational 
meaning, are largely anecdotal, and are 
not designed with the express purpose of 
relating response to color to personality 
evaluation. Some of the more recent
studies concerned with the semantic 
connotation of color and the attribution 
of affective states to different hues, 
however, suggest consistent and stable 
associations (Schaie, 1961b; Tannen- 
baum & Osgood, 1957; Wexner, 1954). 
There is evidence also that individual 
differences beyond group consistencies 
are sufficiently large to indicate potential 
clinical utility (Schaie, 1961a). 

Most of the studies in the English 
language literature on response to color 
are concerned with semantic connota- 
tions and group differences in a specific 
preference, but with the exception of 
investigations of the response to color 
on the Rorschach (Siipola, Kuhne, & 
Taylor, 1950) and a monograph on the 
color preference of psychiatric groups 
(Warner, 1949), no attempt is found 
involving the systematic exploration of 
response to color as a personality assess- 
ment technique. The German literature 
on the other hand contains at least two 
systematic procedures for utilization of 
response to color as a clinical tool (Lue- 
scher, 1950; Pfister, 1950). One of 
these techniques, the Color Pyramid 
Test (Farbpyramidentest) (CPT), al- 
though originally proposed primarily as 
a new projective method for the study 
of affect, also appears to meet the 
criteria specified above for an objective 
nonverbal personality assessment tool. 

The CPT was originally devised by 
the Swiss psychologist Max Pfister 
(1950). Most of the systematic de- 
velopment work, however, was con- 
ducted at the Psychological Institute 
of the University of Freiburg by Robert 
Heiss, Hildegard Hiltmann, and their
students. Much of the work on this 
technique is therefore found in doctoral 
dissertations at Freiburg. A respectable 
volume of published research has also 
been accumulated on the technique dur- 
ing the decade since its introduction but 
almost none of this material has been 
published in English. It is the purpose 
of the present paper to bring this promis- 
ing technique to the attention of Ameri- 
can psychologists and to review the 
relevant research literature. 


DESCRIPTION OF THE TECHNIQUE 


The material used for the test con- 
sists of a piece of white paper on which 
a pyramid has been drawn which leads 
in five steps from the base to the apex 
(see Figure 1). The pyramid contains 
15 fields, each of which is 1 inch square. 
In addition to this form there are many 
pieces of colored paper in 24 different 
hues, also 1 inch square. A record form 
contains six small replicas of the pyramid
and various tabulations which facilitate
recording and scoring.2


2 The test materials (including the colored
gummed paper, test manual, and record forms
in German) are available from the publishers,
Hans Huber, Bern, Switzerland. A total of
21 chromatic and 3 achromatic hues are used
and the chromatic hues are classified further by
major color. Thus there are four reds, greens,
and blues; three purples; and two oranges,
yellows, and browns. Throughout this paper
we shall use “hues” to refer to the 24 distinct
stimulus shades and “colors” to refer to the
10 color groups used as major scoring categories.

The instructions for administering
the test are simple and require a minimum
of training beyond the usual considerations
of good testing practices.
The examiner places the pyramid form
and the shuffled colored papers before
the subject. He then gives the following
instructions:

Take some of these colored chips and place
them on the fields of the pyramid in any way
you prefer. You may exchange the chips on
the pyramid in any way you please. Be sure
to make as pretty a pyramid as you can. Tell
me when you are finished, but take your time,
as it is not important to finish quickly.

When the subject indicates that he
has finished, the color of each chip on
the pyramid is recorded on the corresponding
fields of the pyramid on the
examiner's record form. The chips are
then returned to the container, shuffled,
and the subject is instructed to arrange
another pyramid. The instructions are:


Now make another pyramid. Make it as
pretty as you can.


After the second pyramid has been
completed, the colors are again recorded
by the examiner and the subject is asked
to arrange a third pyramid. After the
third pyramid has been completed and
recorded the examiner instructs the subject:

Now make a pyramid which is as ugly as
you can make it.


After the first ugly pyramid is completed
the colors are again noted and
the subject is asked to arrange a second
and finally a third “ugly” pyramid. A
total of six pyramids are thus required
from each subject.


In order to permit the subject to
choose any combination of colors, it is 
necessary to have as many colored chips 
available as there are fields in the pyra- 
mid. This means that there must be 15 
pieces of each hue. Precautions need 
be taken to avoid examining color-blind 
subjects. It is desirable therefore to 
use a device such as the Ishihara plates 
to pretest all subjects. Since the results 
of the CPT with the color-blind would 
require alternate norms (which might 
have to differ with respect to the vari- 
ous types and degrees of color defect) 
their study by means of this test is 
probably best avoided. 


SCORING PROCEDURE 


The first step in scoring the CPT 
consists of counting the frequency of 
occurrence of each hue in each of the 
pyramids and recording these frequen- 
cies in the appropriate column of the 
record sheet. Frequency of occurrence 
is then totaled separately for the three 
pretty and three ugly pyramids for each 
hue and for each color. As a result of
these operations one obtains the rank 
order of frequency (called "color for- 
mula" in the German literature) with 
which the various colors were chosen. 
It has also been customary to trans- 
form the raw frequencies into percent- 
ages prior to further analysis. This 
latter step may seem unnecessary, how- 
ever, and American psychologists will 
probably prefer using the raw scores di- 
rectly to compare them with normative 
data expressed in some standard score 
form (see Schaie, 1962a; Michel, 1959). 
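
The counting step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the hue labels (a color name followed by an index, such as "red3") and the function names are hypothetical conventions adopted here, not part of the published test materials, and ties in the rank order are broken alphabetically for want of a rule in the text.

```python
from collections import Counter


def color_of(hue: str) -> str:
    """Map a hypothetical hue label such as 'red3' to its color group 'red'."""
    return hue.rstrip("0123456789")


def tally(pyramids):
    """Total hue and color frequencies over one series of pyramids.

    Each pyramid is a list of 15 hue labels, one per field.
    """
    hue_counts = Counter()
    color_counts = Counter()
    for pyramid in pyramids:
        if len(pyramid) != 15:
            raise ValueError("each pyramid has 15 fields")
        hue_counts.update(pyramid)
        color_counts.update(color_of(hue) for hue in pyramid)
    return hue_counts, color_counts


def color_formula(color_counts):
    """Colors in descending order of frequency (ties broken alphabetically)."""
    return sorted(color_counts, key=lambda c: (-color_counts[c], c))
```

The same functions would be applied once to the three pretty pyramids and once to the three ugly pyramids, since the two series are totaled separately.
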

The next scoring step is the determination
of the “sequence formula,”
which has four digits. The sequence
formula is obtained by examining the 
occurrence of the different colors over 
the three pyramids in each series. The 
four digits of the formula represent the
“constant sum” (CS), the number
of colors which appear in all three
pyramids; the “sum of minimal change”
(MiS), the number of colors appearing
in two of the three pyramids; the
“sum of maximal change” (MaS),
the number of colors appearing in only
one of the three pyramids; and the
“avoidance sum” (AS), the
number of colors constantly avoided. As
there is a total of 10 colors, the se- 
quence formula 5:2:1:2, for example, 
would mean that 5 colors had appeared 
in all three pyramids, 2 colors in two 
out of three, 1 color only once, and that 
2 colors had been avoided. 
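
A minimal sketch of the sequence formula follows, assuming each pyramid has already been reduced to the list of color groups laid on its 15 fields. The function name and the English color labels are illustrative choices made here; only the four sums and the 10 color groups come from the description above.

```python
# The 10 color groups: seven chromatic and three achromatic.
COLORS = ["red", "orange", "yellow", "green", "blue",
          "purple", "brown", "white", "gray", "black"]


def sequence_formula(pyramids):
    """Return (CS, MiS, MaS, AS) for one series of three pyramids.

    Each pyramid is a list of color names, one per field.
    """
    if len(pyramids) != 3:
        raise ValueError("a series consists of three pyramids")
    used = [set(p) for p in pyramids]
    # Number of pyramids (0 to 3) in which each color appears at least once.
    appearances = [sum(color in u for u in used) for color in COLORS]
    cs = appearances.count(3)    # constant sum: in all three pyramids
    mis = appearances.count(2)   # sum of minimal change: in two of three
    mas = appearances.count(1)   # sum of maximal change: in only one
    avoid = appearances.count(0) # avoidance sum: never used
    return cs, mis, mas, avoid
```

Because every color falls into exactly one of the four classes, the four digits always total 10, which gives a quick check on hand scoring.
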

A number of other complex charac- 
teristics are usually computed. They 
are the so-called “color syndromes” and
involve the computation of the total fre- 
quencies or proportions of incidence for 
certain color combinations. The most 
frequently mentioned ones are: the
“stimulation syndrome” (Ssyn), which
includes yellow, orange, and red; the
“normal syndrome” (Nsyn), including
red, blue, and green; the “achromatic
syndrome” (Asyn), including black,
white, and gray; and the “drive syndrome”
(Dsyn), including green, yellow,
and brown. The proportions of responses
utilizing “warm” and “cold”
colors are also frequently computed. Of
the 21 chromatic hues, all yellows,
oranges, and reds, as well as one of the
browns and two of the greens, are considered
warm hues. All purples and blues, the other
brown, and the remaining two greens
are considered cold hues.
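
The syndrome scores are simple sums over color groups, as the following sketch suggests. It assumes color frequencies are supplied as a mapping from color name to count for one series of pyramids; the function name and the returned layout are hypothetical, while the membership of each syndrome follows the listing above.

```python
# Color groups entering each syndrome, as listed in the text.
SYNDROMES = {
    "Ssyn": ("yellow", "orange", "red"),  # stimulation syndrome
    "Nsyn": ("red", "blue", "green"),     # normal syndrome
    "Asyn": ("black", "white", "gray"),   # achromatic syndrome
    "Dsyn": ("green", "yellow", "brown"), # drive syndrome
}


def syndrome_scores(color_counts):
    """Return {syndrome: (total frequency, proportion of all choices)}."""
    total = sum(color_counts.values())
    if total == 0:
        raise ValueError("no color choices recorded")
    scores = {}
    for name, members in SYNDROMES.items():
        freq = sum(color_counts.get(color, 0) for color in members)
        scores[name] = (freq, freq / total)
    return scores
```

Note that the syndromes overlap (red, green, and yellow each enter two of them), so the four totals need not sum to the number of fields.
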

While the scoring procedures so far 
described are clearly unambiguous and 
straightforward, the final step concerned 
with the identification of the structural 
properties of the subject’s response in 
laying the pyramid, involves somewhat 
subjective judgment. Scoring rules have 
been specified, however, which should 
lead to high interscorer agreement. The
major distinctions made about the pyramidal
structure are based on the assumption
that the subject may either
attend to the form qualities of the pyra- 
mid or that he may ignore such quali- 
ties and simply regard the pyramid as 
a flat surface. A distinction is further 
made between orderly and irregular 
structuring with the implicit assump- 
tion that any orderly structure takes 
cognizance of the pyramidal form. Ir- 
regular structures are those where a 
harmonious blending of colors is inter- 
rupted by unsuccessful attempts at 
structuring. 

In terms of the use of color, three 
major structural categories may be de- 
scribed. The subject whose response is 
color dominated will lay a carpetlike de- 
sign of characteristic color scatter. 
Where a need for organization prevails
but color dominated responses still predominate,
more or less well-ordered
“layer” designs involving color separation
will appear. Finally, where need
for organization of the subject’s experience
predominates, the so-called “structures”
involving color organization will
appear. Twelve distinct types of pyramid
structure have been described
(Heiss & Hiltmann, 1951) and may be
recognized by the following characteristics:

Patterns Indicating Color Dominated 
Response. No attention is paid to the 
pyramidal aspects of the design nor to 
the center, base, and top of the pyra- 
mid. Colors are scattered so as to 
achieve a harmonious result. This type 
is called a “carpet” (Teppich) and 
several subtypes are recognized. 

1. The pure carpet (Der reine Tep- 
pich). There is complete scatter of col- 
ors and no two patches of the same hue 
are used adjacently. The hues blend 
into a scatter which does not show 
marked brightness contrasts. 

2. The unbalanced carpet (Der un- 
ausgewogene Teppich). This pattern 
contains adjacent pieces of the same or
similar hue or shows adjacent pieces
with marked brightness contrast. An 
example of contrast would be the light- 
est shade of red next to the darkest 
shade of green. The unbalanced carpet 
suggests a lessened attention to color 
without substitution of constructive 
structuring. 

3. The torn carpet (Der zerrissene 
Teppich). Use of white pieces which 
are not blended into a design gives cer- 
tain patterns the appearance of being 
torn or destroyed. One or more white 
pieces which are not part of a design in 
an otherwise typical carpet pattern 
qualifies for assignment to this category. 

4. The structured carpet (Der Teppich
mit Ordnungsansatz). Some attempt
towards structure is made by
using the same color for the corners or
the pyramidal axis (top, center, and 
base point). Any other partial attempts 
at symmetry or a single layer of the 
pyramid arranged in similar hues should 
also be placed in this category. 

Patterns Indicating Responses In- 
volving Color Separation. The construc- 
tion of the pyramid in this type of 
design pays attention to the layers or 
rows of the pyramid. There is generally 
a separation from one layer to another. 
The following subtypes are recognized: 

5. The monochromatic layer (Die 
einfarbige Schichtung). This is the 
special case where the whole pyramid 
is either laid in the same hue, or more 
frequently where the layers are obtained 
by using the different hues of a given 
color. 

6. The multichromatic layer (Die 
farbige Schichtung). Every row of the
pyramid is laid in a single color. Ordi- 
narily adjacent rows will be in different 
colors, but patterns with adjacent rows 
in the same color would also be classi- 
fied here as long as at least two different 
colors are used in the pyramid. 

7. The symmetrical layer (Die symmetrische
Schichtung). This pattern
has a symmetric arrangement within 
one or more of the layers. The design 
within each layer is independent of and 
not integrated with the symmetric de- 
sign in any other layer of the pyramid. 
The symmetry may be a function of 
different hues of the same color or of 
different colors. It might also be a 
function of brightness; e.g., black -
light blue - dark purple - light purple -
dark green.

8. The structured layer (Die Schich- 
tung mit Strukturtendenz). This is a 
transitional pattern and may be recog- 
nized by the identification of a struc- 
tured design within a pyramid which 
retains the general characteristics (i.e., 
color separation) of the layer type. 

Patterns Which Are Structure Domi- 
nated. These patterns are characterized 
by an arrangement of the colored pieces 
in a specific design within the pyramid. 
There is evidence that the subject has 
recognized the pyramidal form and is 
aware that some rule or orderly
principle is appropriate for his task. Such
principles are no longer restricted to a
separation of colors as in the layer
patterns but are now directed towards
designing an integrated whole. The
following subtypes are recognized:

9. The symmetrical structure (Die 
symmetrische Struktur). This type
involves a systematic organization of
colors about the axis of the pyramid. In
contrast to the structured layer, it
involves the entire pyramid without
color separation among the rows.

10. The mantle pyramid (Die
Mantel-Pyramide). In this pattern the two
sides and sometimes also the bottom
row of the pyramid are in the same
color or a single but different color is
used for each of the sides. The core
of the pyramid may be a single color
or any color combination.

11. The asymmetric-dynamic structure
(Die asymmetrisch-dynamische
Struktur). This category includes pyra- 
mids containing monochromatic trian- 
gles or two triangular combinations 
which have identical multichromatic 
components even though they are not 
in symmetric position. This category 
must not be confused with the struc- 
tured carpet or structured layer type. 

12. The staircase structure (Die 
Treppen-Struktur). Layers are ar- 
ranged in ascending order beginning 
from the left or right corner of the pyra- 
mid to form a staircase. Most frequently 
only two colors are used. Use of more 
colors is rare and would represent a 
rotated multichromatic layer. 

Brehmer (1960) derived an index of 
form dominance by combining all types 
of carpets, layers, and structures, and 
assigning scores for any individual rec- 
ord on a seven-point scale as follows: 

1—three carpets 

2—two carpets and one layer or structure 

3—one carpet and two structures 

4—three structures or one carpet, one structure and one layer

5—two structures and one layer 

6—two layers and one carpet or one structure 

7—three layers 


Brehmer assumes in this scale that lay- 
ers are more form dominant than struc- 
tures, an assumption that would proba- 
bly be questioned by other investigators. 
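
Because every combination of three pyramids drawn from the three broad categories appears exactly once in Brehmer's list, the scale can be read as a simple lookup table. The following is a minimal sketch under stated assumptions: the function and table names are hypothetical, the grouping of pattern types 1-4, 5-8, and 9-12 into carpets, layers, and structures follows the list given above, and the code is an illustration rather than Brehmer's own scoring procedure.

```python
from collections import Counter

# Pattern type number (1-12, as listed above) -> broad category.
CATEGORY_BY_TYPE = {
    **{t: "carpet" for t in (1, 2, 3, 4)},
    **{t: "layer" for t in (5, 6, 7, 8)},
    **{t: "structure" for t in (9, 10, 11, 12)},
}

# (carpets, layers, structures) among three pyramids -> scale value,
# transcribed from the seven-point scale reported for Brehmer (1960).
BREHMER_SCALE = {
    (3, 0, 0): 1,
    (2, 1, 0): 2,
    (2, 0, 1): 2,
    (1, 0, 2): 3,
    (0, 0, 3): 4,
    (1, 1, 1): 4,
    (0, 1, 2): 5,
    (1, 2, 0): 6,
    (0, 2, 1): 6,
    (0, 3, 0): 7,
}


def form_dominance_index(pattern_types):
    """Return the form dominance score for a record of three pyramids.

    pattern_types: three integers in 1..12, one per pyramid.
    """
    if len(pattern_types) != 3:
        raise ValueError("a record consists of exactly three pyramids")
    counts = Counter(CATEGORY_BY_TYPE[t] for t in pattern_types)
    key = (counts["carpet"], counts["layer"], counts["structure"])
    return BREHMER_SCALE[key]
```

For example, a record of one pure carpet, one multichromatic layer, and one symmetrical structure, `form_dominance_index([1, 6, 9])`, falls at the midpoint of the scale.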


NORMATIVE DATA


The original test manual (Heiss & 
Hiltmann, 1951) presents data on 300 
normal subjects covering a wide age 
range. Means for the pretty pyramids 
are reported in percentage values for 
the colors with no measures of score 
dispersion. Normative data on the color 
syndromes and the sequence formula 
are scattered throughout the manual 
but are difficult to use. 

A much more satisfactory collection 
of norm tables is provided by Becker 
and Karl (1955). Their tables include
norms for a sample of 300 adults
stratified by sex, age, occupation, and
education according to the 1950 census for
the Federal Republic of Germany and 
West Berlin. These tables give mean per- 
centages and their standard deviations 
for the colors in both the pretty and 
ugly pyramids. Similar norms are
provided for 150 boys and 150 girls in high
school, with the additional indication of
the range ± 1 P.E. Norm tables for the
pretty pyramids alone, giving mean
percentages, standard deviations, and
ranges of ± 1 P.E., are provided for 60
preschool children (aged 2-8 to 4-2), 68
6-year-old, 61 10-year-old, 56
12-year-old, and 55 13-year-old primary
school students. Data for another sample
of 120 boys and 113 girls (aged 10-14
years) in another grade school are also
given.

Schaie (1962a) provides norms for
children and adolescents based on a
sample of 650 American public school
children from kindergarten to Grade 12.
He provides sten (10-point scale with
one-half sigma intervals) tables separately for
boys and girls for samples from two 
adjacent grades, each table based on 
an N of 50 subjects. Data for both 
major colors and hues are given for the 
pretty and ugly pyramids. 
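
As an illustration of how a sten table relates to raw scores, the conventional sten scale divides the distribution into ten bands, each one-half standard deviation wide, with the boundary between stens 5 and 6 at the mean. The sketch below uses that generic linear definition; the function name and the example norm values are hypothetical, and the published tables may have been constructed differently (for instance, from normalized percentile ranks).

```python
import math


def to_sten(raw_score, norm_mean, norm_sd):
    """Convert a raw score to a sten (1-10) given a norm mean and SD.

    Each sten band is half a standard deviation wide; sten 6 begins
    at the norm mean, and the extreme bands 1 and 10 are open-ended.
    """
    if norm_sd <= 0:
        raise ValueError("norm_sd must be positive")
    z = (raw_score - norm_mean) / norm_sd
    sten = math.floor(2 * z) + 6
    return max(1, min(10, sten))
```

With a hypothetical norm mean of 50 and standard deviation of 10, a raw score of 62 lies between 1.0 and 1.5 standard deviations above the mean and therefore receives a sten of 8.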

Michel (1959) gives similar tables 
for a modified 14-hue CPT set based on 
a T transformation of the raw data for 
a sample of 180 German adults selected 
according to census classifications with 
respect to age, sex, occupation, and
education. 

INTERPRETATION OF CPT SCORES 
Color Frequencies 

Pfister (1950) had originally pro- 
posed an intuitive rationale based on 
folklore and clinical experience to
interpret the meaning of high or low
preferences for specific colors. To provide
a more rational basis for interpretation,
Heiss and Hiltmann (1951) examined
the color frequency patterns of normal 
and abnormal groups of subjects. They 
assume that preference or rejection of 
those colors which are preferred or re- 
jected by specific diagnostic groups are 
related to the salient personality char- 
acteristics displayed by members of 
such deviant groups. They assume fur- 
ther that one may legitimately assign 
similar characteristics to individuals 
showing similar CPT color choice pat- 
terns. Since many diagnostic groups 
show overlapping CPT patterns, their
work consists essentially of identifying 
a consistent system of apparent rela- 
tionship. This is again a rather intui- 
tive patchwork, but at least conducted 
with the benefit of some empirical data. 

Heiss (1952) comes closest to giving
a concise set of interpretive statements
on the meaning of high or low color
choices. He describes high red as being
characteristic of impulsive affect while
low red is considered to be concomitant
with reduced responsivity and affect.
Orange is considered to be an
extraversion-introversion indicator, with high
orange related to the ability to
establish good interpersonal relations. A
similar interpretation is given to
yellow, except that this color relates to the
ability to establish objective and
rationally controlled rather than
emotionally determined interpersonal
relations. Green is linked to sensitivity and
internalization of affect. A high green
may be characteristic of imaginative
individuals who are well able to
internalize and sublimate, or it may reveal
an individual who is being overwhelmed
by his emotional experience and as a
result shows symptoms of psychological
disturbance. A low green in contrast
is considered characteristic of low
sensitivity and blandness. Blue is
representative of cognitive impulse
regulation, with high blue characteristic of
individuals using rational modes of
response and low blue being associated
with aimless and poorly regulated
behavior. High purple is considered to be
evidence of anxiety and emotional
disturbance. Brown is related to need
determined negativism. High white is
characteristic of schizophrenic blandness
and reduction of inhibitions, and high
black is considered to be evidence of
depressive repression.

Seeger (1954) suggests that an inter- 
pretive system derived from the be- 
havior of abnormal groups cannot be 
very satisfactory in normal personality 
functioning. This criticism has been 
met in part by Spreen (1955) who tried 
to determine the meaning of the achro- 
matic colors for a sample of 100 normal 
adults by comparing their CPT patterns 
with those of abnormal groups. He 
agrees generally with Heiss’ descriptions 
but much of his discussion appears to 
be purely speculative. 

Schaie (1962c), in an attempt to
validate Heiss’ descriptions, examined the
patterns of teacher trait ratings assigned 
to subjects with low and high values 
for the various colors. He finds that 
although several of Heiss’ propositions 
appear valid, it is necessary to devise 
separate interpretive systems for male 
and female subjects and that important 
variations from Heiss’ suggested infer- 
ences appear at least for normal chil- 
dren and adolescents. 


Color Syndromes 


Hiltmann and Heiss (1950) suggest 
further inferences to be drawn from the 
color combinations which they denote 
as color syndromes (see section on
scoring above). They suggest that Nsyn,
which includes the colors most
frequently used by normal subjects, gives
an indication of general psychic balance.
A high Nsyn would go with
overcontrolled or constricted personalities,
while a low Nsyn would suggest loosening of
controls and fluidity of defenses. The
Ssyn is supposed to give an indication of 
the individual's mood state, with a high 
Ssyn being indicative of elation. The 
Dsyn is said to be positively related to 
the individual's energy level (see Karl, 
1953) and Asyn is considered evidence 
of denial and withdrawal tendencies. 
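
A syndrome score may be sketched as the combined frequency of its component colors. The example below is an assumption-laden illustration: it covers only Nsyn (red-green-blue) and Ssyn (red-orange-yellow), whose compositions are stated in this review, it assumes that a syndrome is the simple sum of the percentage values of its colors, and the function name is hypothetical. The compositions of Dsyn and Asyn are not given in this passage and are therefore omitted.

```python
# Color composition of the two syndromes whose make-up is stated in the text.
SYNDROMES = {
    "Nsyn": ("red", "green", "blue"),
    "Ssyn": ("red", "orange", "yellow"),
}


def syndrome_scores(color_percentages):
    """Sum the percentage values of the colors making up each syndrome.

    color_percentages: mapping of color name -> percentage of chips
    of that color across the three pyramids of a series. Colors that
    were never chosen may be omitted and count as zero.
    """
    return {
        name: sum(color_percentages.get(color, 0.0) for color in colors)
        for name, colors in SYNDROMES.items()
    }
```

Note that red enters both syndromes, so the two scores are not independent of one another.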


Sequence Formula 


This set of scores is considered to re- 
veal some of the most stable informa- 
tion on the individual's personality 
structure. Heiss and Hiltmann (1951) 
suggest that the total number of differ- 
ent colors used gives an indication of 
the breadth or narrowness of the indi- 
vidual's response repertory, the middle 
digits of the formula (MiS and MaS) 
give an indication of the continuity or 
discontinuity of the individual's
capacity to respond to different types of
stimuli, and the last digit of the for- 
mula (AS) gives the individual’s tend- 
ency towards response avoidance. They 
further present a scheme which permits 
making statements from the sequence 
formula about constriction and lability, 
flexibility and anxiety, as well as the 
stability of the individual’s personality 
pattern. An analysis of the various
types of sequence formulas has been 
made by Wewetzer (1954) and a fur- 
ther contribution by Hiltmann (1954) 
attempts to integrate interpretive state- 
ments made on the basis of both color 
choices and the sequence formula. 


Pyramid Structure 


Wewetzer (1951b) proposes that the 
patterning of the pyramids will yield 
information on structural aspects of 
personality. He suggests that carpet 
types, indicating color dominance, may 
be characteristic of a labile personality 
structure as seen in young children but 
when a successful blending of colors is
achieved may also suggest creative
flexibility. He distinguishes special
subtypes, of which the unbalanced carpet
with darkness-lightness contrast is
deemed characteristic of the prepubertal
personality, the torn carpet with its
white patches is deemed characteristic of
personality disturbance of a
schizophrenic type, and the structured
carpet is a transitional pattern,
frequently seen in adolescents, indicating
increasing stabilization of personality
structure.

Another set of patterns involves color 
separation. The most striking type is 
that of the monochromatic pyramid,
which is rarely seen in normals and is deemed
characteristic of developmental retarda- 
tion or pathological constriction. The 
multichromatic layer shown by many 
children and adolescents is character- 
istic of an incompletely differentiated 
personality structure, the symmetric 
layer still suggests some lability which 
is being hemmed in by caution and 
timidity, and the structured layer is 
another transitional type towards in- 
creasing differentiation of personality 
structure. 

Among the structure-dominated pat- 
terns, Wewetzer distinguishes the sym- 
metric structure which is characteristic 
of a stable well-differentiated person- 
ality, the mantle pyramid which is 
deemed characteristic of neurotic repres- 
sion and denial, the asymmetric-dy- 
namic structure which suggests a well- 
differentiated and flexible but also 
relatively vulnerable personality, and 
the staircase structure related to a well- 
differentiated but conflict dominated 
pattern. 

Ugly Pyramids 

Heiss, Honsberg, and Karl (1955) 
extend the CPT by adding to the
conventional instruction that the subject is
to form three “pretty” pyramids, a
demand for a further three pyramids with
the instruction to make them as “ugly”
as possible. Several hypotheses for an 
interpretation of the results of the sec- 
ond series are given in their paper. They 
suggest that the unconscious aspects of 
personality may be more adequately 
portrayed under the ugly condition. 
Another hypothesis suggests that the 
pretty pyramids reveal the “actual”
aspects of personality while the ugly
pyramids yield insights about the
“latent” facets or potentialities. Similarly,
Karl (1956) argues that the pretty 
pyramids indicate behavioral modes 
which the subject in fact utilizes, while 
the ugly pyramids refer to those be- 
haviors which may be available but are 
low in the response hierarchy. These 
authors suggest further that test results 
for both series must be interpreted in 
conjunction with one another. Those 
variables receiving similar scores under 
both conditions may refer to the stable 
aspects of personality, while discrepan- 
cies may point to areas of conflict. 
Karl specifically proposes that one ought 
to use the average values for both series 
in making inferences about the test 
results.
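
The joint reading of the two series can be sketched numerically: for each color, the average of the pretty and ugly values corresponds to Karl's proposal, while the difference between them indicates where the two conditions diverge. This is only a sketch under stated assumptions; the function name and output layout are hypothetical, and the sources cited here do not specify how large a discrepancy must be before it is taken to indicate conflict.

```python
def combine_series(pretty, ugly):
    """Combine color percentages from the pretty and ugly series.

    pretty, ugly: mappings of color name -> percentage for that series.
    Returns, per color, the average of both series and the signed
    discrepancy (pretty minus ugly). A color absent from one series
    is treated as zero percent in that series.
    """
    combined = {}
    for color in sorted(set(pretty) | set(ugly)):
        p = pretty.get(color, 0.0)
        u = ugly.get(color, 0.0)
        combined[color] = {"average": (p + u) / 2, "discrepancy": p - u}
    return combined
```

A color with a discrepancy near zero would, on the authors' hypothesis, reflect a stable aspect of personality, whereas a large positive or negative value marks a variable scored differently under the two instructions.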


RELIABILITY OF THE CPT 


Most of the reported reliability stud- 
ies use test-retest methods although the 
structure of the test lends itself readily 
to internal consistency analyses. The 
resulting reliability coefficients seem no 
better or worse than those given for 
most other instruments in the area of 
personality measurement with average 
values at about the .60 level. As will 
become apparent by examining some of 
the pertinent studies, reliabilities vary 
considerably for the several scoring 
variables as well as depending upon the 
particular conditions of experimental 
situations. 

Test-Retest Reliability. Jolas (1953)
administered the CPT to 50 adults,
retesting after an interval of 4 weeks.
Reliability coefficients are reported for the
pretty pyramids ranging from .47 to
.81 with a mean coefficient of .61.
Reliabilities for the individual hues ranged
from .33 to .72 with a mean of .56.
Coefficients for the sequence formula 
ranged from .57 to .83. 

Pflanz (1954) readministered the 
CPT under nonstandard conditions to 
50 hospital patients who were placebo 
subjects in a drug experiment. He re- 
ports reliabilities for the various hues 
ranging from .14 to .61. Reliabilities 
for CS and AS of the sequence formula 
are reported as .52 and .46, respectively. 

Brehmer (1960) gave the CPT to
45 male and 63 female Swedish
extension students with retesting after 5
weeks. She obtained average reliability
coefficients for the 10 colors of .41 for
her male and .55 for her female
subjects. Reliabilities reported for the CS
and AS of the sequence formula are .58 
and .47 for male and .52 and .58 for 
female subjects. Reliabilities are also 
reported for the identification of the 
structural properties of the pyramids, 
as well as for Brehmer’s index of form 
dominance, which have a magnitude of 
.62 and .66 for male subjects and .77 
and .52 for female subjects. 

Internal Consistency. Since each ex- 
perimental series contains three trials 
(pyramids) it is possible to use the 
analysis of variance to get an estimate 
of internal consistency. One may partial 
out the components of variance
associated with the scoring categories
(colors and instructions) and individual
differences and thus estimate the pro- 
portion of "reliable" variation. Schaie 
(1962a) administered the CPT to a
group of 43 delinquent girls in a state 
training school. He used the analysis 
of variance to estimate internal con- 
sistency in the above manner and ob- 
tained an overall coefficient of .74. 
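
The general logic of an analysis-of-variance estimate of internal consistency can be illustrated for a single scoring variable. The sketch below is a simplified two-way layout (subjects by trials), in the manner usually attributed to Hoyt, in which the proportion of reliable variation is taken as one minus the ratio of the residual mean square to the between-subjects mean square. It is not a reconstruction of the analysis reported above, which also partialled out variance associated with colors and instructions; the function name and the data are hypothetical.

```python
def hoyt_reliability(scores):
    """Estimate internal consistency from a subjects-by-trials table.

    scores[i][j] is the score of subject i on trial (pyramid) j for
    one scoring variable, e.g. the number of red chips used.
    Returns 1 - MS_residual / MS_subjects.
    """
    n = len(scores)
    k = len(scores[0]) if n else 0
    if n < 2 or k < 2 or any(len(row) != k for row in scores):
        raise ValueError("need a rectangular table with >= 2 rows and columns")

    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Between-subjects sum of squares (individual differences).
    ss_subjects = k * sum((m - grand) ** 2 for m in row_means)
    # Residual: what remains after removing subject and trial effects.
    ss_residual = sum(
        (scores[i][j] - row_means[i] - col_means[j] + grand) ** 2
        for i in range(n)
        for j in range(k)
    )

    ms_subjects = ss_subjects / (n - 1)
    ms_residual = ss_residual / ((n - 1) * (k - 1))
    if ms_subjects == 0:
        raise ValueError("no variation between subjects")
    return 1 - ms_residual / ms_subjects
```

When subjects keep exactly the same rank and spacing across the three pyramids the residual vanishes and the coefficient is 1; the more a subject's score fluctuates from pyramid to pyramid relative to differences between subjects, the lower the estimate.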

Reliability of CPT Modifications.
Some investigators have attempted to 
increase the reliability of the CPT by 
suggesting modifications of administra- 
tion and scoring procedures. In particu- 
lar, the effect of the reduction of the 
number of hues offered the subject has 
been investigated in several studies. 
Michel (1959) constructed a 14-hue 
set by removing two of the reds, blues, 
and greens and one of the purples and 
browns. He reports reliabilities on a 
group of 45 normal adults with retesting 
after 5 weeks. His values range from 
.54 to .83 for the pretty pyramids, from 
.60 to .79 for the ugly pyramids, and 
from .58 to .76 for the values of the 
sequence formula. 

Reinert (1958) retested 100 normal 
adults after an interval of 4 weeks with 
a 10-hue set retaining only 1 hue for 
each color. His reliabilities range from 
.63 to .78 for the pretty pyramids and 
.62 to .76 for ugly pyramids. The coef- 
ficients for the sequence formula ranged 
from .61 to .69. 

Apparently reliability can be in- 
creased by reduction of the stimulus 
complexity. Whether such reduction is 
accompanied by information loss is as 
yet to be ascertained. It seems clear, 
however, that other modifications of 
the CPT might well merit study (see 
Luthe & Salman, 1953). 


VALIDITY OF THE CPT 


Most of the validity studies to be re- 
ported are concerned with the power of 
the CPT in distinguishing pathological 
from normal groups of subjects and as 
a device for differential diagnosis among 
different pathological categories. A 
limited number of studies is also con- 
cerned with validating the CPT as a 
device for the description of personality 
dynamics or relating particular scoring 
variables to other observable behavior 
characteristics. Part of this literature
is reminiscent of the discussions on the
possibility of validating projective
techniques (Ebermann, 1955; Seeger, 1954)
appearing in American journals in the
late thirties and will therefore be
ignored or mentioned only briefly. Since
many of the interpretive statements pro- 
posed by Heiss (1952) are based on 
group characteristics, evidence relating 
to the test's use as a tool in differential 
diagnosis will first be considered. Much 
of the literature to be reviewed was pub- 
lished before the addition of the ugly 
pyramid instructions (Heiss et al., 
1955). Reported results on color choice 
therefore refer to performance on the 
three pretty pyramids unless otherwise 
indicated. 


Differences between Normal and Abnor- 
mal Subjects 


Wewetzer (1951a) examined a group 
of 90 schizophrenics, 25 manic-depres- 
sives, and 70 epileptics as compared 
with 100 normal controls. He found 
displacement of the color hierarchy for 
the abnormals and suggests a scheme 
of “rising” and “falling” colors which 
will distinguish normal from abnormal 
CPT records. The rising colors were 
those which in abnormal subjects at- 
tained a higher rank ordering than in 
normal subjects (purple, green, white, 
gray, and brown), while the falling col- 
ors were those which attained a lower 
rank in abnormal than in normal sub- 
jects (yellow, blue, red, and black). 
Brengelmann (1957) presents evidence 
supporting this hypothesis for a group 
of depressives who show means in 
the expected directions significant at 
the 5% level of confidence. Schizo- 
phrenics, however, could not be dif- 
ferentiated by the above criterion. 
Wewetzer also concluded that the ab- 
normal group as a whole demonstrated 
greater choice frequencies for green and 
purple, a significantly lower Ssyn
(red-orange-yellow), and a more frequent
choice of white and gray. The incidence 
of structuring was much lower for the 
abnormals with color domination as
much as three times as frequent as in
normals, but with about equal emphasis
on color separation. Brengelmann
(1953a), using a sample of 30 nurses
and 30 soldiers for his normals, and 30 
neurotics, 30 depressives, and 30 schizo- 
phrenics for his abnormal subjects, 
confirms increased choice of purple for 
his abnormal subjects as well as an in- 
creased choice of brown and a lowered 
choice of blue. An examination of the 
hypothesis of lessened structuring for 
abnormals led to its rejection (Brengel- 
mann, 1953b). Heiss, Karl, and Wewet- 
zer (1953) on re-examination of Bren- 
gelmann’s data, however, claim that 
his data in fact support the hypothesis.
Their criticism seems justified, particu- 
larly as several other studies with ab- 
normal groups lend further support to 
the notion that lessened structuring 
with accompanying color dominance 
seems characteristic of abnormal CPT 
response (see Frohoff, 1953; Seyfried, 
1957). 

O'Reilly and Blewett (1959) admin- 
istered the CPT to 43 schizophrenics, 
22 nonschizophrenic psychiatric pa- 
tients, and 48 normal control subjects. 
Their findings indicate that nonschizo- 
phrenic psychiatric patients show sig- 
nificantly greater preference for purple 
than do normals but that schizophrenics 
and normals cannot be differentiated by 
this criterion. On the other hand, the
whole group of abnormal subjects showed 
significantly higher preference for orange 
while the schizophrenics showed higher 
preference for white. Nonschizophrenic 
psychiatric patients also showed greater 
preference for yellow and brown. An 
examination of the validity of Nsyn
(red-green-blue) as a differentiating
characteristic gave negative results.

Another differentiating characteristic
which has received some attention is
concerned with the subjects' deviation
from the expected frequency of the
distribution of hues for each color.
Conrad (1954) names this distribution the
“standard probability” of CPT response
and argues that a random choice of
colors would result in an approximation
to the standard probability. He shows
that the characteristic behavior of
abnormal subjects (100 schizophrenics
and 50 depressives) was a regression of
their mean scores to the expected
frequency, while an examination of the
means of normal subjects in Heiss’
sample (1951) indicated that these
subjects showed quite significant deviations
from expected frequencies. Conrad
interprets these findings as implying that
subjects with behavior pathology no
longer select colors but choose them
randomly. An increase in the frequency
of carpet patterns shown in records of
abnormal subjects, which is typically
accompanied by greater randomness of
color choice, is also cited as evidence
for Conrad's hypothesis. Brengelmann
(1955) argues that re-examination of
the evidence presented by Conrad does
not support the hypothesis of greater
deviation from “standard probability”
by normal subjects. In a later study
(Brengelmann, 1957), this author
reports no significant differences between
normal and abnormal subjects with a
trend for schizophrenics to show higher
deviation than any other group.

Brengelmann (1957) also investigated
the variability of color choice as a
differentiating criterion between groups of
normal and abnormal subjects.
Variability is found to be greater for
abnormal subjects but an interaction with
the specific color involved was noted.
Thus, neurotics are found to be more
variable than depressives on blue, red,
brown, and gray; normals more variable
than depressives on green, yellow,
purple, gray, and white; schizophrenics
more variable than normals on blue,
red, yellow, purple, gray, and white;
neurotics more variable than
depressives on blue, red, black, gray, and
white; and depressives more variable
than neurotics on orange. Individual
variability from empirical norms was
also investigated. Brengelmann
computed averages for a random half of his
control group, then computed deviations
for the other half and the abnormal
groups in his study. Results for overall
variability show significant differences
with normals being less variable than all
the abnormal groups.

While the evidence is somewhat con- 
fusing, fair agreement seems to obtain 
that abnormal subjects produce pyra- 
mids with less structure and that high 
preference for brown, purple, and white 
may be taken as possibly pathognomic 
indicators. 


Schizophrenics 


Wewetzer (1951a) studied 90 schizo- 
phrenics and concluded that in this 
group green assumes first preference in 
rank order. He also found this group
to give greater preference to purple, 
brown, white, and gray. He also found 
substantial differences in pyramid pat- 
terning. Sixty-one percent of his sub- 
jects had color dominated patterns (as 
compared to 19% of Heiss and
Hiltmann's normal subjects), while only
18.4% had structure dominated patterns 
(52% of normal records). Of particu- 
lar interest was his finding that almost 
a third of the schizophrenic records 
were of the “torn” pyramid type, i.e.,
white had been used without being 
blended into a design or color mixture. 

The incidence of white as a pathog- 
nomic sign of schizophrenia has also 
been reported by O'Reilly, Holzinger, 
and Blewett (1957). These
investigators administered the CPT under
nonstandard conditions to 43
schizophrenics, 25 nonschizophrenic psychiatric
patients, and 48 student nurses. They
found white being chosen at least once
by 76.7% of the schizophrenic, but only
by 8% of the nonschizophrenic patients,
and by 29.1% of the nurses. They also
report the incidence of torn pyramids 
to be twice as great among the schizo- 
phrenics as among the normal controls. 

Frohoff (1953) administered the CPT
to a group of 100 schizophrenics (37
undifferentiated, 23 paranoid, 20
hebephrenic, 14 catatonic, and 6
schizoaffective subjects; of the total
group 50 were male and 50 female). He finds
that green is given first preference by 
his male but not by his female subjects, 
but does not increase in preference by 
a significant amount. However, he also 
reports significant increments for pur- 
ple, gray, and white, with a correspond- 
ing lowering for yellow and orange. 
Frohoff's data verify the increased color 
dominance in patterning for schizo- 
phrenics, with a more intense shift for 
male than female subjects. He does 
not find evidence for the incidence of 
torn pyramids but since his criteria for 
identifying these patterns differ from 
those used by Wewetzer, their material 
does not seem directly comparable for 
this characteristic. Catatonics showed
particular loss of structure and there
were some other significant differences 
between subgroup means which appear 
to be related to differential changes of 
affect. Because of the small frequencies 
in the subgroups, caution is advocated 
as to the recognition of pathognomic 
signs but data for the subgroups are
reported in the original paper.

Sacher (1955) was particularly
impressed by reported clinical incidence
[...] only 15 hues with oversized chips and
pyramid pattern. He obtained a greater
number of monochromatic pyramids in 
a group of so-called apathetic-autistic 
hebephrenics, with as many as 68% of
his 44 subjects constructing monochro- 
matic pyramids. He argues that no 
similar observations have been reported 
on other patients with schizoid disturb- 
ances and offers the incidence of such 
patterns as a criterion for differentiating 
among schizophrenics. Several other
studies employing schizophrenic sub- 
jects resulting in negative or con- 
founded results have already been 
mentioned in the section on differentia- 
tion between normal and abnormal 
subjects (Brengelmann, 1955a, 1955b, 
1957; Conrad, 1954). A detailed dis- 
cussion of the individual protocols of 
eight schizophrenic patients is given by 
Becker (1955).


Manic-Depressives 


Wewetzer's study (1951a) had a 
group of 25 manic-depressives whose 
CPT records were characterized by a 
greater number of color dominated rec- 
ords than his normals but less than 
the schizophrenic subjects. The manic- 
depressive group also showed increased 
preference for purple, brown, and gray, 
with lowered preference for red, yellow, 
and orange. Brengelmann (1957) also 
found a lowered yellow for his 36 
depressives. 


Epileptics 


Wewetzer (1951a) examined 70 epi- 
leptics and indicated that this group 
showed the greatest avoidance of struc- 
tured patterning. Only 10.8% of the 
pyramids produced by this group showed 
structure, 61.4% (about equal to the 
schizophrenic group) had color domi- 
nated pyramids, the remainder (about 
equal to the normals) showing color 
separation. The color preference pattern
of this group was very similar to that of the
manic-depressives, except that there
was an apparently significantly greater
preference for green and lower
preference for yellow.


Mental Defectives


Spreen (1951a) had a few defectives 
in a sample of abnormal youngsters. 
He reports the incidence of monochro- 
matic pyramids and an increase in the 
preference for yellow and purple. Schaie 
(1962b) examined a group of 29 boys 
and 29 girls who were patients in a state 
home for mental defectives. He found 
a number of significant differences on 
color choice when he compared his sub- 
jects with data on normal children 
(Schaie, 1962a). It was postulated that 
only those differences were related to his 
subjects’ mental defect which would 
vanish when the subjects’ raw scores 
were re-entered in the norms at ages 
corresponding to their Stanford-Binet 
MA. Differences significant at the 1% 
level showing this characteristic were 
found to be low blue and high brown 
for the male subjects, and high orange 
and high purple for the female subjects. 
Matching his defectives with normal 
subjects by chronological age and specifying
optimal cutoff points for these
criteria Schaie was able to classify cor- 
rectly 78% of his 116 test records. 


Neurotics 


Spreen (1951b) reports records on 
18 neurotic psychotherapy patients who 
show high preference for red and black 
as well as the incidence of “mantle” 
patterns rarely seen in normal subjects. 
Brengelmann's (1957) 32 neurotic sub- 
jects had a significantly higher prefer- 
ence for brown and significantly lower 
preference for yellow than his normal 
controls. Kloska (1958) examined the 
records of 25 neurotic subjects who as 
compared with the Heiss and Hiltmann 
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norms show increased preference for 
purple and black and decreased prefer- 
ence for yellow. This author next se- 
lected the three most typical subjects, 
in each of four diagnostic groups 
(schizoids, depressives, compulsives, and 
hysterics), took their average profile as 
well as a normal average profile, and 
computed Q correlations which were all 
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ings on Factor II. 

Ziolko (1956) administered the test 
repeatedly to 35 female neurotics in in- 
patient psychotherapy. Their progress 
was marked by increasing preference 
for black, an increase in the frequency 
of monochromatic pyramids with an 
accompanying reduction in the number 
of different colors used. He gives sev- 
eral case histories showing disintegration 
of form quality followed by reintegra- 
tion during the therapy process. 


Character Disorders 


Seyfried (1957) compared 70 insti- 
tutionalized delinquent boys with a 
group of 45 normals. Both groups 
ranged in age from 7 to 16 years with 
a mean age of 11. The only significant 
difference in the color choice of the two 
groups appeared in the delinquents’ 
greater preference of purple. Seyfried 
agrees with Conrad's hypothesis (1954) 
that structure tends to influence color 
choice and conducts separate frequency 
analyses for subjects whose pyramids
have predominant “carpet,” “layer,” or 
“structure” characteristics. He shows 
that subjects with structures in both 
control and experimental groups devi- 


Siedow (1958) examined 330 patients 
in a psychiatric hospital who had been 
committed primarily because of conflicts 
with society. Sixty-eight of his subjects 
were asocial psychopaths, 56 female 
sexual delinquents (mostly prostitutes),
75 were men who committed sexual 
assault on children, and 131 were 
chronic alcoholics. Results for both 
pretty and ugly pyramids were compared 
with norms for normal subjects (Becker 
& Karl, 1955). As compared with nor- 
mals, the total experimental group was 
significantly higher on orange and gray 
and significantly lower on green for the 
pretty pyramids; orange was signifi- 
cantly higher and green significantly 
lower on the ugly pyramids. In
addition to these differences, the psy- 
chopaths were also higher on ugly 
black; the female sexual delinquents 
were higher on yellow and blue for the 
pretty pyramids; and the male sexual 
delinquents showed greatest preference 
for blue and yellow on the pretty and 
for gray on the ugly pyramids. The 
alcoholics did not show the increased 
gray preference of the other groups, but
did show increased preference for
pretty blue.

Schaie (unpublished) studied a
group of 43 girls (mean age=16.2; 
SD=1.2 years) committed to a state 
home for female juvenile delinquents. 
Their CPT scores were compared with 
normative data on normal adolescents 
(Schaie, 1962a). Differences between 
delinquents and normals significant at 
the 5% level of confidence were found. 
The delinquents were significantly 
higher on purple and black and signifi- 
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cantly lower than normals on gray for 
the pretty pyramids; on the ugly pyra- 
mids the delinquents were significantly 
higher on brown, white, and gray and 
significantly lower on red and orange.


Normal Personality Characteristics 


Although most CPT validity studies 
have so far been concerned with demon- 
strating the ability of the technique to 
differentiate between normals and/or 
psychiatric classification, an equally 
valuable application would be that of 
personality description in normal (and 
abnormal) subjects. One of the first 
studies of this type was an investigation 
by Conrad and Juncker (1954) de- 
signed to investigate CPT correlates of 
differences in constitutional types. The 
investigators examined the CPT proto- 
cols of 40 asthenics and 40 pyknics 
(Kretschmer) with an equal number of 
men and women in each group. The 
only difference in color choice frequen- 
cies was a significantly higher purple 
for the asthenics. However, significant 
differences also appeared in the sequence 
formula, with pyknics having a higher 
Sum of Change and asthenics showing 
a higher CS. The asthenics also showed 
more frequent use of the carpet type 
pattern while pyknic subjects more fre- 
quently used the layer type pattern. 

Karl (1953) was interested in relat- 
ing the CPT to drive structure. He 
used the Pauli Performance Test to se- 
lect from an initial sample of 200 nor- 
mal subjects (mean age: 23 years) two 
groups of 50 subjects each character- 
ized by relatively high or low drive 
levels. He defines a Dsyn consisting of 
the colors brown, green, and yellow, the 
level of which tended to differentiate 
between the two groups. Thus a high 
Dsyn was characteristic of individuals 
with high drive level. Subjects with low 
drive level also showed an increase in 
Nsyn (red-green-blue). 
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Pfizer, Toennesmann, and Weikert 
(1955) examined three groups of young
children (90 kindergarten children
aged 2-8 to 4-2; 68 6-year-olds; and 70
primary graders aged 6-3 to 8-11). There
was some question whether the test
could be validly used with their very
young subjects since color dominated 
pyramids predominated. They argue, 
however, that the differentiation of the 
pyramid was concomitant with the development
of increasingly adequate behavior
as observed by the kindergarten
teachers. In their second group they 
distinguished on the basis of a school 
readiness test between the more- and 
less-developed children. It was found 
that again the less adequately developed
children had a greater incidence of car- 
pet patterns. Data for the older chil- 
dren showed a trend in the direction of 
adult norms. 

Brehmer (1960) correlated the CPT 
scores of 78 male and 98 female young 
adult extension students with trait rat- 
ings by small groups of fellow students 
on 20 seven-point trait scales. 
Significant correlations were obtained,
but all ranged below .30. She does not 
give any information on how well the 
raters were familiar with their ratees 
and the rating variables selected may 
have been somewhat arbitrary. 

Schaie (1962c) administered the 
CPT to 650 public school students 
equally distributed for sex and grade 
from kindergarten to Grade 12. Each 
subject was rated by his home room 
teacher on the 42 variables of Cattell’s 
(1957) “normal trait sphere” using a
three-point scale. The analysis of vari- 
ance was used for each trait to identify 
those CPT variables showing significant 
differences for subjects in the three rat- 
ing categories. Trait ratings were also 
combined into 15 rating factor scores 
using Cattell's factor weights and these 
latter scores were correlated with the
CPT scoring variables. Significant relations
were found with almost all of
the traits, but only few significant rela- 
tions with low coefficients could be es- 
tablished with the factor scores. He 
concluded that the CPT variables as 
conventionally scored seem to relate to 
surface rather than source traits. 


Factor Analysis Studies 


Two factor analytic studies of the 
24 hues under pretty pyramid instruc- 
tions are available. Both these studies, 
however, based correlations on the ab- 
solute choice of colors (ignoring fre- 
quency counts) and therefore com- 
puted tetrachoric correlation coefficients. 
Rainio and Matikainen (1954) factored 
correlations among the form qualities 
and colors chosen on the third pyramid 
from the protocols of 115 Finnish indus- 
trial foremen. They obtained eight fac- 
tors which were related to brightness, 
contrasts of light and dark, saturation
of color and structural characteristics. 
A second analysis of the 24 hues based 
on all three pyramids conducted by Karl 
has not been published but is described 
by Reinert (1958). The subjects for 
this analysis were 250 German male 
workers (aged 17-22). Good agreement 
with the earlier analysis is claimed. 
Reinert tried to define new color group- 
ings on the basis of these analyses by 
computing the sums of differences be- 
tween rows of the rotated factor matrix 
and grouping those variables having 
minimal differences. As a result of this 
procedure he arrived at five groupings. 
Red (pink) becomes a singlet; the two
browns form the second group; black,
the darkest green, three of the reds, and two
violets form another. The fourth grouping
consists of the blues, the lightest purple,
gray, and white. The yellows,
oranges, and greens (except for
the darkest green) form the last group. 
These results are reasonable but the 
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methods used would appear suspect to 
many factor analysts. It need also be 
kept in mind that neither analysis 
handles the scores as they are computed 
for clinical interpretation. Further fac- 
tor analyses would certainly appear de- 
sirable using typical scoring procedures 
and more conventional methods of 
analysis. 
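
The grouping step attributed to Reinert above can be illustrated roughly as follows. This is only a sketch under stated assumptions: the review does not say whether the row differences were absolute, nor how "minimal" was decided, so the absolute-difference distance, the `threshold` parameter, the single-linkage rule, and all loadings shown are hypothetical.

```python
from itertools import combinations


def row_distance(a, b):
    """Sum of absolute differences between two rows of factor loadings."""
    return sum(abs(x - y) for x, y in zip(a, b))


def group_variables(loadings, threshold):
    """Group variables whose loading rows differ by no more than `threshold`.

    Uses single linkage (an assumption): two variables share a group if a
    chain of close pairs connects them.
    """
    names = list(loadings)
    parent = {n: n for n in names}

    def find(n):
        # Follow parent links to the representative of n's group.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for a, b in combinations(names, 2):
        if row_distance(loadings[a], loadings[b]) <= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical loadings of five hues on two rotated factors.
example = {
    "brown_1": [0.70, 0.10],
    "brown_2": [0.65, 0.15],
    "yellow_1": [0.05, 0.80],
    "orange_1": [0.10, 0.75],
    "pink": [-0.50, -0.40],
}
groups = group_variables(example, threshold=0.2)
```

With these made-up numbers the two browns fall together, yellow and orange fall together, and pink remains a singlet. The outcome depends entirely on the chosen threshold, which is one reason such a procedure might appear suspect to factor analysts.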
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In this review of the psychological study of creativity there are 4 em- 
phases: products, process, measurement, and personality. 3 main issues
concern questions of: definition and criteria, the process viewed temporally,
and necessary personal and environmental conditions. The relationship
between creativity and intelligence is discussed to illustrate the
need for conceptual reorganization as well as correlational data. We
should now be able to utilize personality and stylistic modes as criterion
variables and to study how these factors are related at different age
levels to behavior that is judged to be creative. This approach holds
promise for providing a functional, developmental understanding of
creativity.

The purpose of the present paper is to 
review recent theory and research per- 
taining to the psychological study of 
creativity so as to highlight the issues 
and emphases reflected in the literature. 
Three issues are apparent: (a) What is 
creativity?—questions of definition and 
criteria, (b) How does creativity occur? 
—questions of the process viewed tem- 
porally, and, (c) Under what conditions 
is creativity manifest?—questions of
necessary personal and environmental 
conditions. A striking feature of the 
literature on creativity is the diversity 
of interests, motives, and approaches 
characteristic of the many investigators. 
Creativity has been viewed as a normally
distributed trait, an aptitude trait, an 
intrapsychic process, and as a style of 
life. It has been described as that which 
is seen in all children, but few adults. It 
has been described as that which leads 
to innovation in science, performance in 
fine arts, or new thoughts. Creativity 
has been described as related to, or 
equatable with, intelligence, productiv- 
ity, positive mental health, and origi- 
nality. It has been described as being 
caused by self-actualization and by sub- 


1 The author would like to thank Herbert 
Crovitz, Veterans Administration Hospital, 
Durham, North Carolina; Charles D. Spiel- 
berger, Vanderbilt University; and George S. 
Welsh, University of North Carolina, for their 
suggestions and comments. 
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limation and restitution of destructive 
impulses. Clearly there is a need for 
organization and integration within the 
psychological study of creativity. What 
are the many investigators studying? 
How are they studying it? Four con- 
temporary emphases are apparent: prod- 
ucts, process, measurement, and per- 
sonality. The organization herein will 
follow this same order. The scope of 
the paper precludes an exhaustive pres- 
entation of all theoretical statements 
and research reports. The reader is re- 
ferred to an annotated bibliographical 
volume prepared by Stein and Heinze 
(1960). French and Italian bibliog- 
raphies (Bédard 1959, 1960) are also 
available. 


EMPHASIS ON PRODUCTS 


The use of products as criteria of 
creativity is most frequently encoun- 
tered in investigations in technological 
or industrial settings. In such studies 
creativity is assumed to be a unitary or
multifaceted trait which is distributed in 
the population in a manner comparable 
to other intellective or personality traits 
(see Gamble, 1959, p. 292). Several 
authors believe that creativity can best 
be studied through products.

In the “Committee Report on Criteria
of Creativity” (Gamble, 1959), it was
stated that the products of creative 
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behavior should be the first object of 
study. After the products are judged 
“creative” the term can be applied to 
the behavior which produced them, and 
also the individuals who performed the 
behavior can be classed as possessing to 
some degree the trait of creativity. 

Several possible product criteria of 
creativity were proposed by McPherson 
(1956) who reviewed the problem of 
determining “inventive level” of patents.
Ghiselin (1958) stated that the ap- 
proach outlined by McPherson would 
not provide the true criteria of creativity 
and distinguished two levels of crea- 
tivity. A higher level of creativity intro- 
duces some new element of meaning or 
some new order of significance while a 
lower level gives further development to 
an established body of meaning by initi- 
ating some advance in its use. 

While the utility of studying creativ- 
ity through products remains an issue, 
Harmon (1958), D. Taylor (1958), and C.
Taylor (1959) have studied relation- 
ships between criterion variables and de- 
terminants of judges’ creativity ratings. 
Harmon reported correlations of .61 and 
.76 between judged creativity and num- 
ber of publications. D. Taylor reported 
a correlation of .69 between ratings of 
creativity and ratings of productivity 
given by supervisors of research per- 
sonnel. In later reports, Taylor (1960, 
1961) argued that distinctions among 
problem solving, decision making, and 
creative thinking can best be made in 
terms of the product. A large number of 
measures were refined by C. Taylor to 
yield 56 scores on each of a group of 
research scientists. Included in the re- 
fined measures were supervisor, peer, 
examiner, and self-evaluations; counts 
of reports and publications; official 
records; and membership in professional 
societies. Factor analysis yielded 27 
factors. The finding that among the 
many correlations four out of any five 
variables were independent of a given 
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criterion was cited as evidence for the 
“almost overwhelming complexity of the 
criterion problem.” 


EMPHASIS ON PROCESS 
Creative Process and Illumination 


An alternative to the study of crea- 
tivity through products is to study the 
process of creativity. Wallas (1926) 
described the stages of forming a new 
thought as follows: preparation, incuba- 
tion, illumination, and verification. 
While the four stages could be distin- 
guished from one another, Wallas noted 
that they do not occur in an uninter- 
rupted problem and solution sequence. 
Controversy has appeared concerning 
the distinctness of the stages and the 
relative importance of conscious or other 
modes of mental activity. 

Dashiell (1931) noted the four stages 
of the creative process and related in- 
spiration to insight in learning. Recall 
is dependent on the absence of interfer- 
ing associations set up by excessive con- 
centration on the recalling. Similarly, 
Woodworth (1954) stated that incuba- 
tion implies a theory he prefers not to 
accept. Illumination, he believed, is the
result of laying aside a problem, giving 
the mind a chance to rest and at the 
same time to get rid of false sets and 
directions. Relating the recall of a for- 
gotten name to creative insight, Wood- 
worth stated that the sudden recall of a 
forgotten name after previous futile at- 
tempts suggests that an essential factor 
in illumination is the absence of inter- 
ferences which block progress during the 
preliminary stage.

In addition to considering explana- 
tions of unconscious processes and the 
weakening of erroneous sets, Crutchfield 
(1961) suggested that incubation may 
permit, perhaps unaware to the individ- 
ual, new and better cues from the en- 
vironment and from ideation to develop 
while one engages in other activities. An
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experiment cited suggested that the sub- 
ject's performance on a former task may 
facilitate insight on a later task even 
though they report no awareness of the 
relevant cue present in the preceding 
task. Instead of the study of distinct 
stages, Crutchfield recommended a func- 
tional analysis which would seek lawful 
accounts of the manner in which each 
step of the creative thinking process was 
functionally determined by prior steps 
and in turn governed succeeding steps. 

Ghiselin (1956) described insight as 
the crucial action of the mind in crea- 
tion. He too preferred to consider the 
creative process as consisting of fewer 
discrete steps, and stated that no sort 
of calculation from known grounds will 
suffice for creative production. Required 
for creativity is a fresh formulation, 
rather than copying with variations or 
elaborations. Although he believed that 
concepts of unconscious thoughts are 
imprecise, he did admit the importance 
of diversion which is conceptualized as 
being related to what he terms preconfigurative
consciousness.

The illumination controversy could be 
enlarged upon and conceptually updated.
However, the guidelines seem
clear in regard to the creative process. 
Crutchfield's paper is helpful in that he 
attempted to translate the somewhat 
literary descriptions of the creative proc- 
ess into better conceptualized psycho- 
logical variables. 


Creative Process: Systematized, Goal 
Directed, or Plastic 


For Harmon (1956) the creative 
process is any process by which some- 
thing new is produced: an idea or object, 
a new form or arrangement of old ele- 
ments. The essential requirement is 
that the new creation must contribute 
to the solution of a problem. The crea- 
tive process is goal directed. Harris 
(1959) saw the creative process as con- 
sisting of six steps: (a) realizing the 
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need, (b) gathering information, ( 
c) thinking through, (d) imagining solutions,
(e) verifying, and (f) putting the
ideas to work. He stated that the difference
between the electrified or illuminated
minds of some geniuses and the
processes in ordinary people is the speed
with which they proceed from Step a to
Step d (see also Arnold, 1959).

Taking a different view, I. Taylor
(1959) stated that the rules of logic and 
scientific method are a psychological 
straight jacket for creative thought. He 
proposed five levels of creativity which 
he identified by the analysis of over 100 
definitions of creativity.
Expressive creativity is most funda- 
mental, according to Taylor, involving 
independent expression where skills, 
originality, and the quality of the prod- 
uct are unimportant. Spontaneity and 
freedom are apparent from which later 
creative talents develop. Individuals 
proceed from the expressive to the pro- 
ductive level of creativity when skills 
are developed to produce finished works. 
The product is creative in that a new 
level of accomplishment is reached by 
the person though the product may not 
be stylistically discernible from the work 
of others. Inventive creativity is operative
when ingenuity is displayed. This
level involves flexibility in perceiving 
new and unusual relationships between 
previously separate parts. It does not 
contribute to new ideas but to new uses 
of old parts. Innovative creativity 
requires strong abstract conceptualizing 
skill and is seen when basic foundation 
principles are sufficiently understood so
as to allow improvement through modifi- 
cation. The highest form of creativity 
is “emergentive creativity,” which in- 
volves the conception of an entirely new 
principle at a most fundamental and
abstract level. The core of the creative 
process in Taylor’s view is the ability 
to mold experiences into new and dif- 
ferent organizations, the ability to per- 
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ceive the environment plastically, and 
to communicate the resulting unique experience
to others.

Stein (1956) stated that creativity is 
a process of hypothesis formulation, 
hypothesis testing, and the communica- 
tion of results which are the resultant 
of social transaction. Individuals affect 
and are affected by the environment in 
which they live. The early childhood 
family-environment transaction facili- 
tates or inhibits creativity. An empirical 
definition of manifest creativity is sug- 
gested by Stein (1956): 
Creativity is that process which results in a 
novel work that is accepted as tenable or 
useful or satisfying by a group at some point 
in time [p. 172]. 
Potential creativity is suggested when an 
individual does not satisfy the require- 
ments of the stated definition, but never- 
theless performs on psychological tests 
like individuals who do manifest creativ- 
ity. In an earlier paper Stein (1953) 
elaborated upon his definition of crea- 
tivity. 

EMPHASIS ON MEASUREMENT 

Factor Analytic Approach 


Since the publication of Chassell's 
(1916) paper numerous investigators 
have attempted to devise or adapt tests 
that would measure creative abilities. 
Although the types of tests have not 
changed very much over the past 55 
years, the methods of analysis have be- 
come more complex. For example, Guil- 
ford attempted to define the entire 
structure of intellect by factor analytic 
methods. In one of the more recent re- 
visions of his system, he presented a 
“unified theory of intellect" making use 
of a cubical model of intellectual abilities 
in which each dimension represents a 
mode of variation among the factors 
(Guilford, 1959a). The lack of psychological
knowledge in the area of creativity
may be attributable, according to
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Guilford (1959b) to the inappropriate- 
ness of the SR model for the study of 
higher processes. 

Instead, Guilford (1959b) recom- 
mended a trait approach for the study of 
creativity and stated that the most de- 
fensible way of discovering dependable 
trait concepts is factor analysis. He at- 
tempted to place his research on crea- 
tivity within the larger context of the 
structure of intellect. Noting some 47 
known factors of intellect, Guilford sug- 
gested that they can be put into a three- 
way classification according to: the kind 
of material or content of thought, the 
varieties of activities or operations per- 
formed, and the varieties of resultant 
products. In this system each primary 
intellectual ability represents the inter- 
action of a kind of operation applied to 
a kind of material, yielding a kind of 
product. Most needed, according to 
Guilford, was a more thorough under- 
standing of the nature and components 
of intellect. Accordingly, most of the 
data reported concern the isolation of a
primary factor believed to be of importance
for creativity.

The factorial aptitude traits that Guil- 
ford currently believes to be related to 
creativity are described as: ability to 
see problems, fluency of thinking (the 
factors of word fluency and ideational 
fluency), flexibility of thinking (the 
factors of spontaneous flexibility and 
adaptive flexibility), originality, redefi- 
nition, and elaboration. The types of 
cognitive abilities Guilford believes to 
be of importance for creativity are re- 
flected in the measuring devices he has 
designed or adapted. Very briefly de- 
scribed, his tests require individuals to 
state defects or deficiencies in common 
implements or institutions; to produce 
words containing a specified letter or 
combination of letters; to produce in a 
limited time as many synonyms as they 
can for a stimulus word; to produce 
phrases or sentences; to name objects 


552 


with certain properties (for example, 
objects that are hard, white, and 
edible); or to give various uses for a 
common object. Guilford’s (1959b) 
practice in scoring fluency factors is to 
emphasize sheer quantity—"quality 
need not be considered so long as re- 
sponses are appropriate [p. 146]." Other 
tests employed ask examinees which of 
a given list of objects could best be 
adapted to make another object; or to 
construct a more complex object from 
one or two simple lines. 

Guilford (1959b) presented three 
ways to measure the trait of originality: 
counting the number of responses that 
are judged to be clever, utilizing items 
calling for remote associations, and 
weighting the subject’s responses in pro- 
portion to their infrequency of occur- 
rence in a population of subjects. The 
first two procedures require a quality 
criterion. 
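
The third procedure, weighting responses by their infrequency, lends itself to a small illustration. The sketch below is not Guilford's scoring scheme, whose exact weights are not given here; it assumes a weight of one minus the proportion of subjects giving a response, and the subjects and responses are invented.

```python
from collections import Counter


def infrequency_weights(all_responses):
    """Weight each response by 1 minus the share of subjects who gave it.

    `all_responses` maps a subject label to that subject's list of responses.
    The weighting formula is an assumption made for illustration.
    """
    n_subjects = len(all_responses)
    counts = Counter(
        r for responses in all_responses.values() for r in set(responses)
    )
    return {r: 1 - c / n_subjects for r, c in counts.items()}


def originality_scores(all_responses):
    """Sum the infrequency weights of each subject's distinct responses."""
    weights = infrequency_weights(all_responses)
    return {
        subject: sum(weights[r] for r in set(responses))
        for subject, responses in all_responses.items()
    }


# Hypothetical answers to "give uses for a brick".
sample = {
    "A": ["build wall", "doorstop"],
    "B": ["build wall", "paperweight"],
    "C": ["build wall"],
    "D": ["build wall", "doorstop", "grind into pigment"],
}
scores = originality_scores(sample)
```

A response given by every subject earns nothing, so subject C scores zero despite answering appropriately, while subject D gains most from the one response no one else offered. Unlike the first two procedures, no judgment of quality enters the score.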

Much of the research efforts of Guil- 
ford and associates has been devoted to 
the definition of factor traits by isolat- 
ing patterns of concomitant variation 
(see Guilford, 1957; Guilford, Kettner, 
& Christensen, 1954, 1956; Kettner, 
Guilford, & Christensen, 1959). The 
studies reviewed herein cluster into two 
groups: (a) those studies demonstrating 
a relationship between measures of the 
factors and criterion variables and (b) 
those studies suggesting no relationship 
between measures of the factors and 
judged creativity. 

Correlations of .25 between grades in 
an astronomy course and performance 
on a test of expressional fluency, .37 
between scores on a test of ideational 
fluency and a criterion of engineer per- 
formance based on pay increases, and .31 
between a measure of adaptive flexibility 
and the pay increase criterion were 
reported by Guilford (1956). Adaptive 
flexibility was reported to have consist- 
ently shown a relationship to perform- 
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ance in mathematics (average r .33). 
Three Guilford originality tests (Unu- 
sual Uses, Consequences, and Plot 
Titles) were reported by Barron (1956) 
to correlate in the range of .30-.36 with 
10 judges’ ratings of originality. Signifi- 
cant multiple correlations were reported 
by Chorness (1956) between a com- 
posite factor score from the Guilford 
battery and United States Air Force 
student-instructor characteristics judged 
to be demonstrative of creative expres- 
sion. The best single predictor was a 
test of controlled associations. Statis- 
tically removing the effects of intelli- 
gence demonstrated that the creativity 
tests could be employed as predictors of 
instructor performance since the factor 
composite predicted the student instruc- 
tor grades for the phase of the program 
studied better than an intelligence index 
which had previously been relied on. 

No significant differences between 
groups rated as creative or not creative 
on the factors of redefinition, closure, 
ideational fluency, associational fluency, 
spontaneous flexibility, sensitivity to 
problems, and originality were found by 
Drevdahl (1956). Similarly, Gerry, 
DeVeau, and Chorness (1957) reported 
no significant differences on the Guilford 
battery between awarded and non- 
awarded employees when the groups had 
been equated for intelligence, job per- 
formance, and education. 

Surveying several years of research on 
creative, effective people, MacKinnon 
(1961) stated that in all samples 
studied, the Guilford tests, scored for 
quantity or quality, did not correlate 
well with the degree of creativity as
judged by experts in the subjects' own 
fields. Substantiating this, correlations 
reported by Gough (1961) between 
criterion ratings of creativity and several
of the Guilford tests were: Unusual Uses
(quantity -.05, quality .27); Consequences
(quantity -.27, quality -.12);
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Matchsticks (.04); Gestalt Transfor- 
mations (.27). 


Relationship between Measured Crea- 
tivity and Measured Intelligence 


Using an IQ measure (Stanford-Binet, 
Wechsler Scale for Children, or Hen- 
mon-Nelson) and five creativity meas- 
ures (Word Association, Uses for Things, 
Hidden Shapes, Fables, and Make-Up 
Problems), Getzels and Jackson (1959) 
selected two experimental groups. One 
group was composed of children who 
placed in the top 20% on the creativity 
measures when compared with same-sexed
age peers, but below the top 20%
in measured IQ. The second group con- 
sisted of subjects who placed in the top 
20% in IQ, but below the top 20% on 
the creativity measures. Despite the 
similarity in mean IQ between the high 
creative group (IQ=127) and total 
population (IQ=132), and despite the 
23-point difference in mean IQ between 
the two experimental groups in favor of 
the high intelligence group (IQ=150), 
the achievement scores of the two ex- 
perimental groups on standard subject- 
matter tests were equally superior to the 
achievement scores of the remainder of 
the school population. These data are
discussed more fully in a recent volume 
(Getzels & Jackson, 1962). 
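
The selection rule described above can be sketched in a few lines. This is an illustration only: it assumes a single composite creativity score per pupil, ignores the comparison with same-sexed age peers, and uses invented field names (`id`, `iq`, `creativity`) and invented data.

```python
def top_fraction_cutoff(values, fraction=0.20):
    """Return the lowest score that still falls in the top `fraction`."""
    ranked = sorted(values, reverse=True)
    k = max(1, int(len(ranked) * fraction))
    return ranked[k - 1]


def select_groups(pupils, fraction=0.20):
    """Split pupils into high-creative and high-IQ groups.

    High creative: top `fraction` on creativity but not on IQ.
    High IQ: top `fraction` on IQ but not on creativity.
    Pupils in the top fraction on both measures belong to neither group.
    """
    iq_cut = top_fraction_cutoff([p["iq"] for p in pupils], fraction)
    cr_cut = top_fraction_cutoff([p["creativity"] for p in pupils], fraction)
    high_creative = [
        p["id"] for p in pupils
        if p["creativity"] >= cr_cut and p["iq"] < iq_cut
    ]
    high_iq = [
        p["id"] for p in pupils
        if p["iq"] >= iq_cut and p["creativity"] < cr_cut
    ]
    return high_creative, high_iq


# Hypothetical school of ten pupils.
iqs = [150, 148, 130, 127, 125, 120, 118, 115, 110, 105]
creativity = [40, 70, 90, 88, 60, 55, 50, 45, 42, 41]
pupils = [
    {"id": i + 1, "iq": iq, "creativity": cr}
    for i, (iq, cr) in enumerate(zip(iqs, creativity))
]
high_creative, high_iq = select_groups(pupils)
```

Because pupils high on both measures are excluded, the two groups are defined by contrast, which should be kept in mind when their achievement scores are compared.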

The main criticisms of the Getzels- 
Jackson report have centered around the 
use of a single atypical school. Torrance 
studied creative thinking in the early 
school years (see Torrance, 1958, 1959a, 
1959b, 1959c, 1959d, 1959e, 1960a, 
1960b, 1960c, 1960d; Torrance, Baker, 
& Bowers, 1959; Torrance & Radig, 
1959) and brought together eight partial 
replications of the Getzels-Jackson 
study. Two batteries of creativity tests 
were used; both consisted of modifica- 
tions of Guilford-type tests with the ex- 
ception of the Ask and Guess Test de- 
veloped by Torrance (Torrance & Radig, 
1959). The procedure followed by Tor- 
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rance (1960c) is similar to Getzels and 
Jackson in that he selected groups who 
placed in the upper 20% on either the 
creativity or IQ measures, but not in 
the upper 20% on the remaining 
measure.

In six of Torrance’s eight groups, 
there was no significant difference on 
measured achievement between the high 
creative and high intelligence groups. In 
two of the elementary schools (the small 
town school and the parochial school) 
there was a significant difference in 
measured achievement in favor of the 
high intelligence group. The question is 
then raised: Under what conditions do 
“highly creative” pupils achieve as well 
as “highly intelligent” ones? Additional 
data reported by Torrance suggest a 
tendency for the highly creative groups 
to be better on reading and language 
skills than on work-study or arithmetic 
skills. 

Meer and Stein (1955) reported a 
significant relationship between research 
chemists' scores on the Wechsler-Belle-
vue, Miller Analogies, and supervisors'
ratings of creativity. When education
was controlled, they concluded that with 
opportunity held constant, IQ beyond 
the ninety-fifth percentile is not signifi- 
cant for creative work. Similarly, sum- 
marizing several studies, Barron (1961) 
suggested that a small correlation (about
.40) exists between the total ranges of
creativity and intelligence. However, 
beyond an IQ of about 120, measured 
intelligence is unimportant for creativ- 
ity. He pointed instead to the impor- 
tance of motivational and stylistic vari- 
ables. 

Criterion Group Empirical Approach

The Welsh Figure Preference Test 
(WFPT; Welsh, 1949, 1959a, 1959b)
is a different type of psychometric in- 
strument used in the study of creativity. 
In short, it is a nonlingual test composed 
of 400 India ink drawings, to each of 
which the examinee must respond “like”
or “don’t like.” Of primary interest in 
the present context is an empirical scale 
derived by contrasting the likes and dis- 
likes of 37 artists and art students with 
the likes and dislikes of 150 people in 
general (Barron & Welsh, 1952). This 
scale has since been revised by Welsh to 
eliminate any response set and in its 
present form the Revised Art (RA) scale 
consists of 30 drawings that artists like 
more frequently than people do in gen- 
eral, and 30 items that artists dislike 
more often than people do in general. 
Rosen (1955) attempted to use the 
earlier form of the Art scale (BW) as a 
predictor of originality and level of abil- 
ity among artists. He reported a signifi- 
cant difference between artists and art 
students as contrasted with nonartists, 
but no evidence that Art scale score in- 
creased as a function of level of training 
of the artist. One art product of each 
of the students was rated on a 5-point 
scale of originality by each of the art 
faculty. The correlation between the 
Art scale score and the average of the 
ratings was .40. The correlation be- 
tween the Art scale score and the grade- 
point average of the students was .34. 
Rank-order correlations of .40 and 
.35 between scores on the RA scale and 
creative writing instructor’s ratings of 
originality and creativity of their stu- 
dents were reported by Welsh (1959a). 
Gough (1961) reported that the BW 
scale showed the highest single correla- 
tion (.41) with criterion judgments of 
research workers’ creativity. Among the 
many measures which did not correlate 
well with the criterion judgments were 
three ability measures, the Allport- 
Vernon value scales, 56 of the 57 Strong 
Vocational Inventory scales, Barron's 
Originality scale, Barron's Preference 
for Complexity scale, the originality 
coefficient from Gough's Differential Re- 
action Schedule, and the six Guilford 
measures already noted.
Data obtained by MacKinnon (1961)
indicated that a group of highly creative 
architects placed in the same range as 
artists on the BW scale, while a less crea- 
tive group obtained lower scores, and a 
third group not distinguished for its 
creativity scored lowest. 


EMPHASIS ON PERSONALITY 


Personality is another major emphasis 
within the psychological study of crea- 
tivity. It can be subdivided into: (a) 
the study of motivation of creative be-
havior and (b) the study of personality
characteristics or life styles of creative 
individuals. Regarding motivation, two 
divergent viewpoints are apparent. One 
describes creative behavior as an emer- 
gent property which matures as the in- 
dividual attempts to realize his fullest 
potentials in his interaction with his en- 
vironment, while the second treats crea- 
tivity as a byproduct of repressed or un- 
acceptable impulses. 

Among the concepts related to the first 
viewpoint are Allport's (1937) func- 
tional autonomy, Goldstein's (1939), 
Rogers' (1954, 1956), and Yacorzyn-
ski's (1954) self-actualization; as well
as May's (1959) and Schachtel's (1959) 
motives for creativity. Individuals are 
described as being creative because it is 
satisfying to them since they have a 
need to relate to the world around them 
so they may experience their selves in 
action. 

Antithetical to these views are the 
concepts of psychoanalytic authors who 
have discussed creativity. Freud (1910, 
1924, 1948) originally postulated that 
all cultural achievements are caused by 
the diversion of libidinal energy. This 
displacement, producing higher cultural 
achievements, he called sublimation 
(Freud, 1930). Several authors have 
described creativity as motivated by ef- 
forts to defend against unacceptable im- 
pulses (see Bergler, 1947; Bychowski, 
1951) or as motivated by unconscious 
restitution for destructive urges (see
Fairbairn, 1938; Lee, 1947, 1948, 1950; 
Rickman, 1957; Sharpe, 1930, 1950). 
Other reductionistic treatments of crea- 
tivity can be found in the writings of 
Abraham (1949), Adler (1927), Bel-
lak (1958), Bischler (1937), Brill
(1931), Ehrenzweig (1949), Grotjahn 
(1957), Kohut (1957), Kris (1952), 
Levey (1940), Rank (1916), and Sachs 
(1951). Criticisms of sublimation theory 
were offered by Bergler (1945), Deri 
(1939), and Levey (1939). 

It is difficult to compare these view- 
points experimentally for several rea- 
sons. Creativity has not been defined by 
either group and it does not seem that 
they are describing the same types of 
behaviors. The reductionistic authors 
most often discuss painting and writing 
in their attempts to explain creativity. 
The self-actualizing group seem to de- 
scribe a much more global style of in- 
teracting with one's environment which 
could lead to products that would be 
judged as creative. Moreover, such con- 
cepts as sublimation and self-actualiza- 
tion are not easily definable or measura- 
ble. There are a few experimental 
studies which have yielded data of vary- 
ing degrees of consistency or inconsist- 
ency with the two views of motivation 
for creative behavior. 


Studies of Motivation for Creativity 


Münsterberg and Mussen (1953) at-
tempted to study several hypotheses de-
rived from psychoanalytic formulations
of the creative personality. They inter- 
preted the data as supporting the follow- 
ing hypotheses: (a) more artists than 
nonartists have intense guilt feelings, 
(b) more artists are introverted and 
have a rich inner life, (c) more artists 
than nonartists are unable or unwilling 
to comply with their parents. No sup-
port was reported for the following hy-
potheses: (a) nonartists are more likely
to show overt aggressive tendencies, (b)
appreciation of the product supplies
basic narcissistic gratification for the 
artist, (c) the artist interprets appre- 
ciation as evidence that others share his 
guilt. Evidence was reported supporting 
the single hypothesis which was not de- 
rived from psychoanalytic formulations 
—that more artists than nonartists show 
a need for creative self-expression. 

Myden (1959) defined a highly crea- 
tive group by choosing 20 subjects from 
“the top rank” of diverse fields of the 
arts. Content and formal analysis of the 
Rorschach suggested that the creative 
group did utilize primary process signifi- 
cantly more than the noncreative group. 
Myden stated that in the creative in- 
dividual the primary process appeared 
to be integrated with the secondary 
process and did not seem to arise from, 
or increase, anxiety. Regression ap- 
peared to be a part of the thinking of 
creative individuals, rather than symp- 
tomatic of loss of ego control. No quan- 
titative difference in anxiety was appar- 
ent between the two groups. The 
creative group was reported to employ 
significantly less repression than did the 
noncreative group. Myden believed that 
this may account for the finding that 
they show a greater amount of psycho- 
sexual ambivalence. 

One large difference between the two 
groups, which is not considered in the 
psychoanalytic literature, was noted to 
be a significantly stronger sense of psy- 
chological role-in-life characteristic of 
the creative group. Myden (1959) de- 
scribed them as “inner-directed and not 
easily swayed by outside reactions and 
opinions [p. 156].”

Golann (1961, 1962) proposed a hy- 
pothetical construct—the creativity mo- 
tive—through which he attempted to 
express the view that creative products 
are only one segment of creative be- 
havior which becomes manifest when in- 
dividuals actively interact with their 
environment so as to experience their
fullest perceptual, cognitive, and expres- 
sive potentials. He argued that high 
creativity motive subjects should prefer 
stimuli and situations which allow for 
idiosyncratic ways of dealing with them. 
In an attempt to demonstrate this, and 
in an attempt towards explanation of 
positive correlations between the Art 
scales of the WFPT and judged crea- 
tivity in painting, writing, and research, 
it was shown that the 30 RA scale items 
liked by artists were significantly more 
ambiguous than the RA items artists 
did not like. A second study revealed 
that individuals who scored high on the 
RA scale, subjects who preferred the 
ambiguous, evocative figures, indicated 
preference on a questionnaire for activi- 
ties and situations which allowed more 
self-expression and utilization of crea- 
tive capacity, in contrast to low RA sub- 
jects who preferred more routine, struc- 
tured, and assigned activities. 
Personality attributes of creative in- 
dividuals have been treated through 
experimental study and theoretical de- 
scriptive reports. Maslow's (1959) de- 
scription of self-actualizing creativeness 
and Rogers' (1954) discussion of condi-
tions within the individual that are
closely associated with a potentially 
creative act are highly similar. Both 
authors placed a great deal of impor- 
tance on openness to experience rather 
than premature conceptualization, and 
on an internal locus of evaluation rather 
than overconcern with the opinions of
others. The theme of individuals' desire 
to fully achieve their potentials through 
their interaction with the environment is 
prominent in these writings. Similar or 
related observations have been made by 
Fromm (1959), Murphy (1947, 1958), 
and Mooney (1953a, 1953b). 


Studies of Personality Attributes of 
Creative Individuals 


The experimental study of personality 
attributes of creative individuals tends 
to contrast criterion groups on either
self-descriptions, others’ descriptions, 
test performance, life history material, 
or work habits. The criterion groups 
have been selected on the basis of either 
ratings of creativity, performance on 
Guilford tests, scores on BW or RA of 
the WFPT, or nomination of individuals 
of outstanding creativity by a panel of 
experts in their field. 

The relationship between self-descrip- 
tion and degree of creativity has been 
studied by several investigators. Bar- 
ron (1952) reported that subjects at the 
lower extreme on the BW scale described 
themselves as contented, gentle, con- 
servative, unaffected, patient, and peace- 
able. In contrast, the high BW subjects 
characterized themselves as gloomy, 
loud, unstable, bitter, cool, dissatisfied, 
pessimistic, emotional, irritable, and 
pleasure seeking. Similar results were 
reported by Barron (1958) in a later 
study. Relating self-descriptions to a
productivity criterion, VanZelst and Kerr
(1954) reported that productive scien- 
tists described themselves as more orig- 
inal, imaginative, curious, enthusiastic, 
and impulsive, and as less contented and 
conventional. Stein (1956) reported 
that creative subjects regard themselves 
as assertive and authoritative, while less 
creative subjects regard themselves as acquies-
cent and submissive. Self-descriptions
for highly creative and less creative
female mathematicians have been re- 
ported by Helson (1961) and by Mac- 
Kinnon (1961) for groups of architects 
varying in creativeness. MacKinnon re- 
ported that the highly creative stress 
their inventiveness, independence, indi- 
viduality, enthusiasm, determination, 
and industry while the less creative 
stress virtue, good character, rational- 
ity, and concern for others. He sug- 
gested that the highly creative are able 
to speak frankly, in a more unusual way 
about themselves because they are more 
self-accepting than their less creative 
colleagues (see also Barron, 1961). 
A dimension similar to that apparent
in the self-descriptions is reflected in the 
test performance of subjects varying in 
creativity. The values of subjects at the 
extremes on the BW scale were inferred 
by Barron (1952) from fine arts pref- 
erences. He reported that low BW sub- 
jects approved of good breeding, for- 
mality, religion, and authority and 
rejected the daring, esoteric, or sensual. 
In contrast, high BW subjects approved 
of the modern, experimental, primitive, 
and sensual while they disliked the aris- 
tocratic, traditional, and emotionally 
controlled. 

Barron (1953) equated performance 
on the BW scale with a bipolar factor 
of preference for perceiving and dealing 
with complexity as opposed to prefer- 
ence for simplicity. Positive relation- 
ships reported for preference of com- 
plexity included: personal tempo, verbal 
fluency, impulsiveness, expansiveness, 
originality, sensuality, sentience, es- 
thetic interest, and femininity in men. 
Negative relationships of preference for 
complexity included: rigidity, constric- 
tion, repressive impulse control, political- 
economic conservatism, subservience to 
authority, ethnocentrism, and social con- 
formity. This dimension is discussed 
more fully in a later report (Barron, 
1961). 

Studying the relationship between 
aptitude and nonaptitude factors, Guil- 
ford (1957) stated that the intercorre- 
lations were generally low. Subjects who 
scored higher on ideational fluency were 
more impulsive, self-confident, ascend- 
ent, more appreciative of originality and 
somewhat less inclined towards neu- 
roticism. Subjects higher on originality 
were more interested in esthetic expres- 
sion, reflective and divergent thinking, 
more tolerant of ambiguity, and felt less 
need for orderliness. 

Independence as a personality at- 
tribute was stressed in several theoreti- 
cal discussions of creativity. Barron 
(1953b) reported that subjects who did 
not yield to the incorrect group con-
sensus in the Asch line judgment situa-
tion scored significantly higher on the
BW scale than a group of yielders. Bar- 
ron (1961) also noted that subjects who 
regularly perform in a creative or orig- 
inal manner on Guilford tests are inde- 
pendent in judgment when put under 
pressure to conform to a group opinion 
in conflict with their own. 

The suggestion that the real difference 
between high and low creative individ- 
uals might be a function of the lows’ 
defensiveness which inhibits generaliza- 
tion and communication of hypotheses 
was offered by Stein and Meer (1954). 
They administered the Rorschach to 
subjects at exposures ranging from .01 
second to full. Their scoring system 
gave the highest score to a well-inte- 
grated response given to a difficult card 
at the shortest exposure. A biserial cor- 
relation of .88 between total weighted 
score and criterion creativity ratings 
was reported. 

The work style of similarly employed 
individuals varying in creativity has 
also been the object of study. Roe
(1949) reported that biologists selected
for eminence in research were very un- 
aggressive, had little interest in inter- 
personal relations, were unwilling to go 
beyond the data presented, and pre- 
ferred concrete reality to the imaginary. 
Other data on Rorschach and Thematic 
Apperception Test performances of 
groups of eminent scientists are dis-
cussed in this and other reports (see
Roe, 1946, 1949, 1951, 1952). Bloom 
(1956) administered projective tech- 
niques to outstanding scientists and re- 
ported personality and temperamental 
characteristics similar to those described 
by Roe. The willingness to work hard 
seems to be the most general character- 
istic of the samples studied. 

Two research styles were reported by 
Gough (1961) to correlate with cri- 
terion ratings of creativity: the man 
who is dedicated to research and sees 
himself as a driving researcher with ex-
ceptional mathematical skills; the man
with wide interests, analytic in thinking, 
who prefers research which lends itself 
to elegant, formal solutions. In a previ- 
ous paper, Gough (1958) had described 
eight types of researchers and how these 
were conceptualized. 

Another source of data bearing on 
creativity and personality is life history 
material. Roe (1953) reported that so- 
cial scientists’ interaction with their 
parents involved overprotection while 
physical and biological scientists de- 
veloped early a way of life not requiring 
personal interaction. 

A negative relationship between rated 
creativity and socioeconomic as well as 
educational status of the parents has 
been reported by Stein (1956). Creative 
subjects were more likely to feel that 
their parents were inconsistent in atti- 
tudes towards them. Less creative sub- 
jects were more likely to engage in group 
activities in childhood while the more 
creative preferred solitary activities. 
Similar trends were reported in extensive 
biographical studies by Cattell (1959). 
MacKinnon (1961) reported relation- 
ships between life history material and 
rated creativity which require and war- 
rant further investigation. 

Crutchfield (1961) attempted to de- 
scribe personality attributes which tend 
to characterize creative individuals in 
general. He reported that in cognitive 
spheres they are more flexible and flu- 
ent; their perceptions and cognitions are 
unique. In approach to problems they 
are intuitive, empathic, perceptually 
open, and prefer complexity. In emo- 
tional-motivational spheres they demon- 
strate freedom from excessive impulse 
control, achieve via independence rather 
than conformity, are individualistic, and 
have strong, sustained, intrinsic motiva- 
tion in their field of work. 


Studies with Children


Few studies have been reported on 
creativity in children despite the great 
interest in the creativeness of childhood. 
Mattil (1953) attempted to study the 
relationship between the creative prod- 
ucts of children and their adjustment. 
The data led Mattil to conclude that ele- 
ments of adjustment and mental abili- 
ties are directly related to creative 
products. 

Limited data on creativity in children 
are included in a monograph by Adkins
and her associates (Adkins, Cobb, Mil- 
ler, Sanford, Stewart, Aub, Burke, 
Nathanson, Stuart, & Towne, 1943). 
Teacher ratings of creativity in school 
children are reported to correlate posi- 
tively with independent measures of the 
following variables: .81 with need for 
sentience (pleasures); .65 with intra- 
ception (imaginative, subjective, human 
outlook); .65 with the need to produce, 
organize, or build things; .63 with the 
need for understanding; .60 with the 
need to explain, judge, or interpret; .50 
with the need to restrive after failure 
and to overcome weakness; .50 with the 
enjoying of thought and emotion for its 
own sake or preoccupation with inner 
activities. Negative correlations re-
ported for the same teacher ratings in-
clude: —.79 with sameness (adherence 
to places, people, and modes of conduct; 
rigidity of habits); —.57 with the need 
of acquisition; and —.54 with the need 
to reject others. 

Reid, King, and Wickwire (1959) re- 
ported that creative subjects exhibited 
superior performance on almost all cog- 
nitive variables, indicating that cogni- 
tive abilities (as measured by general 
intelligence, aptitude, and achievement 
instruments) are related to peer nomi-
nations of creativity. While these re-
sults can be interpreted as generally con-
sistent with studies on adult populations,
the finding that the creative group was
significantly higher on cyclothymia,
while the noncreatives were higher on 
schizothymia, contradicts replicated re- 
sults with adults who have been de- 
scribed as withdrawn or individualistic 
and have themselves said they preferred 
individual pursuits as children. 

A recent study by Torrance (1959d), 


as having good ideas and receive more 
friendship choices. Thus, highly talka- 
tive children tend to earn higher scores 
on the verbal test of creativity, but not 
on the nonverbal measure. Torrance’s 
results suggested that a sociometric cri- 
terion will select children with well-de- 
veloped and exercised verbal abilities 
who are not necessarily more creative 
than many of their peers. A wide range 
of issues concerning the development of 
creativity in children is discussed in a 
paper by Lowenfeld (1959). 


CRITICAL OVERVIEW 

What is creativity? Creativity has 
been viewed as a normally distributed 
trait; as such its investigation has pro- 
ceeded in an attempt to find product cri- 
teria from which the presence or absence 
of the trait in an individual could be 
inferred. Creativity has been viewed as 
the outcome of a complex of aptitude 
traits; as such its investigation has 
proceeded in an attempt to demonstrate 
the presence of such traits through fac-
tor analysis and to develop measuring
instruments. Creativity has been viewed
as a process culminating in a new 
thought or insight; as such its investiga- 
tion has proceeded by introspective re- 
porting, or investigator observation of 
the temporal sequence. Creativity has
been described as a style of life, the per- 
sonality in action; as such its investiga- 
tion has been concerned with personality 
descriptions and assessment of people 
believed to be creative and investigation 
of motives for creativity. 

All of the possible emphases within 
the study of creativity require no justi-
fication other than noting that each is
capable of making important contribu-
tions. It would seem, however, that data
reported by Taylor, Smith, and Ghiselin
(1959), which indicated a very low de- 
gree of association among the many pos- 
sible product criteria, argue against the 
likelihood of a product approach provid- 
ing a comprehensive understanding of 
creativity. Crutchfield's (1961) discus- 
sion of the creative process should be 
helpful to those attempting experimental 
studies. His explanation of illumination 
will require careful study in the light of 
recent reports (see Spielberger, 1962) 
which suggest that the examiner's aware- 
ness of the subject's awareness may be a 
function of the extent of the postexperi-
mental interview. Difficulty may arise
when investigators, working within one
area of emphasis, with one explicit or
implied definition and set of criteria,
lose sight of the inherent limitations of
their choices. The point can perhaps be
illustrated by a reconsideration of the 
relationship between creativity and in-
telligence.

The studies by Getzels and Jackson 
(1959) and Torrance (1959e) indicated 
that measured intellectual ability and 
measured creative ability are by no 
means synonymous. Torrance presented 
additional data which indicated that in 
his sample, using only the Wechsler In-
telligence Scale for Children to deter-
mine giftedness would have excluded
70% of the children placing in the upper 
20% on the creativity measures. The 
same ratios obtain using other measures 
of intelligence. Meer and Stein (1955), 
Barron (1961), and MacKinnon (1961) 
agree essentially that while there is a 
correlation over the entire ranges of in- 
telligence and creativity, the magnitude 
of the correlation varies greatly at differ- 
ent levels of intelligence. Meer and 
Stein cite the ninety-fifth percentile, 
Barron an IQ of 120, as the approximate 
point above which intelligence is unim- 
portant for creativity. The point that 
needs to be stressed is that these data 
are in a sense arbitrary: intelligence is 
not performance on a test; creativity is 
more than test performance or being 
judged as creative. What is needed for 
the understanding of the relationship 
between creativity and intelligence is not 
only data at the correlational level, but 
conceptual reorganization as well. Just 
as the choice of a series of Guilford 
tests or judgment procedures implies one 
definition of creativity, the choice of an 
intelligence test implies one of many 
possible definitions of intelligence. T. 
Taylor (1959), for example, believes in- 
telligence to be an invention of Western 
culture, which stresses how fast rela- 
tively unimportant problems can be 
solved without making errors. He feels 
that another culture might choose to 
measure intelligence in a way more con- 
gruent with a high level of creativity. 

For these reasons Guilford has at- 
tempted to employ a wide variety of cri- 
terion measures, grouped by factor 
analysis, and study relationships among 
the factors.

One could, however, select criterion 
measures on the basis of theoretical con- 
structs and still pay careful attention to 
the predictive efficacy of the criterion 
and compare its predictive ability with
other possible selection criteria. This
author agrees with Guilford that what 
is needed is better understanding of the 
nature of intellect but does not agree 
that factor analysis presents the best 
way of defining one's constructs. The 
factor analytic approach does not solve 
the problem of how well the measuring 
instrument is sensitive to variations in 
the construct its user believes it to meas-
ure. It does not seem that factor analy-
sis will, itself, enrich basic understand-
ing of creative phenomena. Required
are not only data at a correlational level,
but a developmental understanding as
well, and also an understanding of dif-
ferent situations where different correla-
tions are obtained between the same cri-
teria.

If the choice is made to select sub- 
jects on conceptual rather than factor 
analytic bases it would seem that the 
investigator should attempt in some 
manner to isolate the contribution of a 
single criterion choice. The point to be 
made can perhaps best be seen in the 
work reported by Barron and Mac- 
Kinnon. Their studies utilized a com- 
pound criterion in the selection of sub- 
jects: creative, effective individuals. 
The criteria of creativity in all cases 
were judgments, clearly the most care-
fully collected and reliable of all those re-
ported, but judgments nonetheless. The
reports by Harmon (1958) and D.
Taylor (1958) indicated that judges'
ratings of creativity seem heavily deter- 
mined by the productivity of the in- 
dividual. Note that the self-descriptions 
of productive scientists reported by 
VanZelst and Kerr (1954) are very 
similar to the self-descriptions of crea- 
tive individuals reported by Gough, 
Harmon, and MacKinnon. While it is 
crucial that creative, productive people 
be studied as such, it must be kept in 
mind that the portion of the reported 
findings attributable to creativity can-
not be separated from that portion at-
tributable to that which makes for pro-
ductivity, and that which leads to being
seen by judges as creative. It is possi-
ble, and perhaps we are now ready, to
utilize personality and stylistic modes 
as criterion variables. In such an ap- 
proach our criterion variables might be 
tolerance for or seeking of ambiguity, 
openness to experience, childlike traits, 
self-actualization or expression, internal 
frames of evaluation, or independence of 
judgment, to name but a few theoreti- 
cally based descriptive concepts which 
appear again and again in the literature 
and deserve further investigation. The 
important questions would then become: 
How do these cognitive, stylistic, or mo-
tivational modes of interacting with
one's environment develop? What are
the environmental, interpersonal, and
intrapersonal conditions that tend to fa-
cilitate or discourage them? How in turn
are these factors related at different age 
levels to behavior which is judged to be 
creative, effective, and productive? In 
no sense would this approach solve the 
problem of using judgments. The argu- 
ment is not that this approach corrects 
or circumvents most of the problems in- 
herent in other approaches. However, it 
is my belief that the use of theoretically 
derived personality factors as criterion 
variables has, because of its own in- 
herent difficulties, been neglected, yet 
holds most promise of providing a func- 
tional developmental understanding of 
creativity. 


Theorists have proposed relationships between response to color and
personality attributes, mainly those of impulsivity, suggestibility, and
emotionality. A review of research results reveals no support for its
association with impulsivity, or related variables (assaultiveness, ego 
control, etc.). Support is also lacking for its correlation to suggestibility 
in terms of being easily influenced by individuals, though there is dem- 
onstration of its relationship in terms of responding to stimulation 
from the impersonal environment, color being a part of this environ- 
ment. Strong evidence from Rorschach studies for reduced use of color 
by depressed individuals, the only relationship to emotionality estab-
lished, is compatible with the above positive finding, for the depressed
patient is uninterested in the external environment.


Rorschach's hypotheses concerning
relationships between affect and response
to color have prompted much experi- 
mental work, and have been applied in 
the clinical interpretation of the use of 
color in Rorschach responses as well as 
in paintings, the Mosaic Test, etc. Ben- 
ton's (1952) observations on the results 
of experimental studies in this area indi- 
cated that further study was needed: 
It is obvious that the last word on this puz- 
zling problem of the meaning of the color 
responses has not been said. But enough 
returns have come in to indicate that a radi-
cal revision of our ideas concerning the sig-
nificance of color in the Rorschach is in order
[p. 


Since this statement was made, much 
research relevant to this problem has 
been published. This article is designed 
to review the research in this area, espe- 
cially those results since 1952. 
Rorschach (1949) postulated that the 
way in which a person responds to color 
in ink blots reflects his typical method of 
dealing with affect. He defined relation- 
ships of various color scores of the Ror- 
schach to personality variables such as 
impulsivity, suggestibility, and emo- 
tionality, but used terms which have
been criticized as too vague. Norman
and Scott (1952) stated that the theo- 
retical positions of the early workers in 
this area are “vague and full of non- 
referent words.” The term “affect” as 
used by Rorschach and others is espe- 
cially unacceptable to Norman and 
Scott, as well as to Shapiro (1956). The 
latter asserted: 


it is obvious that it has been used in different 
senses by different writers. Some have used
“affect” and “impulse” interchangeably . . .
while others have distinguished between these
concepts. Some have also considered color
responses to reflect an inclination to action, 
and at times concepts relating to affect, im- 
pulse, and action are all used interchange- 
ably [p. 52]. 


The theoretical articles of Rickers- 
Ovsiankina (1943) and Schachtel (1943) 
led Shapiro (1956) to conclude that: 
the common assumption seems to be that the
process of color perception is one which re-
quires less delay, in a manner of speaking less
effort, or less ego activity than the process of
form perception [p. 55].


He hypothesized that subjects using 
large amounts of color on the Rorschach 
would be either hysterics, who have an
abundance of affect without prominent
impulsivity; psychopathic character dis-
orders, who are characterized by impul- 
sivity with little affect; or regressed 
schizophrenics, who are characterized by 
neither affect nor action. 

A similar position was taken by 
Fortier (1953): 
That individual who responds to relatively 
undifferentiated color possesses an ego which 
is less able to control and channel affective 
charge. Such an individual lacks spontaneity 
of action and readily adopts the color of his 
environment [p. 43]. 


Research evidence involving the use 
of color will be reviewed here in relation 
to those personality variables specified 
in the above theoretical statements—im- 
pulsivity or difficulties in emotional con- 
trol, suggestibility or passivity, and the 
abundance of affect or emotion. 

The personality correlates of the use 
of color in perception have been studied 
most extensively by means of the 
Rorschach test, resulting in a predomi- 
nance of Rorschach studies in this re- 
view. Results using other perceptual 
tasks, such as the Lowenfeld Mosaic 
Test, colored versions of the Thematic 
Apperception Test, color-form sorting 
tasks, and subjects' drawings, are much 
more limited, but will be discussed in 
relation to comparable Rorschach stud- 
ies since evidence indicates significant 
relationships between use of color in the 
Rorschach and in other perceptual 
tasks. Holzberg and Schleifer (1955) 
obtained five significant correlations 
ranging around .40 between Rorschach 
measures and performance on various 
perceptual and associative tasks. Allred 
(1959) reported a correlation of .59 for 
the relationship between the use of color 
in drawings and the Sum C score on the 
Rorschach. A study by Matarazzo, 
Watson, and Ulett (1952) demonstrated 
a significant relationship between Sum 
C scores on the Rorschach and a 
similar measure in a perceptual task 
involving intermittent photic stimula-
tion, though the relationship was not
found among anxious subjects. In con- 
trast, the use of color in the Rorschach 
was not found to be comparable to re- 
sponses on color-form tasks (Keehn, 
1953). 

A number of studies (Baughman,
1958) using chromatic and achromatic
versions of the Rorschach cards have
indicated that the percentage of total
responses given to the last three cards
(the 8, 9, 10%), as well as the color
shock signs, are not influenced by the
color in the blots. Other Rorschach
scores are by definition dependent on the
coloring of the cards and are the con-
cern of this review.

IMPULSIVITY OR DIFFICULTIES IN
CONTROL


Only two experimental studies directly 
investigating relationships between im- 
pulsivity and color variables on the 
Rorschach were found in the literature 
and these yielded contradictory results. 
Both studies used ratings of impulsivity 
in college students. Gardner (1951) 
used ratings by three graduate students 
who were well-acquainted with the 10 
subjects and obtained significant rela- 
tionships between impulsivity ratings 
and various Rorschach color scores. 
Rank-order correlations ranged from .79 
for the percentage of color responses in 
the total number of responses to .88 for 
the ratio of CF and C responses to FC 
responses. Holtzman (1950) used rat- 
ings by fraternity members of the degree 
of impulsivity in their fellow members. 
Only one Rorschach score was used, the 
CF:FC ratio, and the results were incon- 
clusive, for the correlation with ratings 
of impulsivity was .42 in one fraternity 
but dropped to .07 in a second fra- 
ternity. 

Some support for the postulated rela- 
tionships between response to color and 
impulsivity is given by investigators
working with drawings, but statistical
analysis of their data was not used. 
Alschuler and Hattwick (1947) and 
Precker (1950), for example, stated that 
they had observed that subjects using 
color to the greatest extent in a drawing 
were impulsive and uncontrolled. 

The assaultive patient's difficulty in 
controlling feelings of hostility would 
lead one to expect increased use of color 
in his Rorschach responses. Two pub- 
lished studies in this area again yielded 
contradictory results. Storment and 
Finney (1953) used groups of assaultive 
and nonassaultive patients matched on 
many other variables that the investiga- 
tors considered might have been con- 
founding. From all the commonly used 
Rorschach scoring categories, the only 
significant relationship found involved 
the frequency of "form minus" color 
responses. Finney (1955) compared 
assaultive and nonassaultive patients 
matched for the same variables, but ob- 
tained significantly lower CF and Sum C 
scores from the nonassaultive patients. 
Cerbus and Nichols (1962) obtained an 
insignificant correlation between ratings 
of patients' expression of hostility and a 
scale measuring the degree of preference 
for achromatic pictures over chromatic 
pictures. 

Studies investigating the use of color 
on the Rorschach by epileptics, psycho- 
paths, and delinquents would be ex- 
pected to bear on this question because 
of the impulsivity that characterizes 
these groups. Fortier (1953) cited three 
references which agreed that the epilep- 
tics gave fewer color responses than the 
normal individual. Similarly, Pruyser 
and Folsom (1955) compared nonhos- 
pitalized epileptics with norms published 
by Beck and his associates, finding that 
the epileptics scored significantly lower 
on FC, CF, C, and Sum C scores. 
Heuser (1946) reported Rorschach
scores of a group of army psychopaths,
but made no statistical comparisons with
scores of a normal group. In any event, 
the average Sum C score of the psycho- 
paths was not higher than average for 
normals, for it would fall at the fortieth
percentile on the norms published by 
Cass and McReynolds (1951), which 
were supported by norms given by 
Brockway, Gleser, and Ulett (1954). 
In regard to juvenile delinquents, 
Schachtel's (1951) research, in collabo-
ration with Glueck and Glueck in their
comprehensive study of delinquency,
compared Rorschach scores of two
matched groups of 500 boys each, but 
no significant differences in color scores 
were found. Fortier (1953) cited a study 
by Endacott on 100 delinquent boys 
which reported a median Sum C score 
which was lower than that given in a 
normative study of boys of that age, but 
no statistical tests were made. Boynton
and Walsworth (1943), similarly, found 
that delinquent girls were significantly 
lower than nondelinquent girls on Sum 
C. Comparable results were obtained 
with other techniques by Cerbus and 
Nichols (1962), using picture prefer- 
ences, and by Phillips and Stromberg 
(1948), using finger paintings. Cerbus and Nichols
obtained a marginally significant corre- 
lation between the MMPI Pd scale and 
a scale of preferences for achromatic 
over chromatic pictures. Phillips and 
Stromberg found no significant differ- 
ence in the numbers of delinquents and 
nondelinquents using only one color in 
the first production, though significantly 
more delinquents used only one color in 
their second productions. It is note- 
worthy that none of the studies in this 
area found significantly increased use 
of color by clinical groups characterized 
by impulsivity, though some studies did 
find significantly reduced use of color 
by these groups. 

A measure of ego control has been 
shown to be unrelated to color respon-
siveness. Cerbus and Nichols (1962)
found no significant relationship between 
the measure of Self-Control on the Cali- 
fornia Psychological Inventory and pref- 
erences for achromatic and chromatic 
pictures. Ego control would be expected 
to have some commonality with ego 
strength, which also has not been firmly 
related to color responsiveness. The Ego 
Strength scale of the MMPI was not 
significantly correlated with preference for
achromatic pictures (Cerbus & Nichols, 
1962), although the factor analysis of 
Rorschach variables and MMPI scales 
by Williams and Lawrence (1954) 
yielded a factor with high positive load- 
ings of Barron’s Ego Strength scale and 
of CF and C scores of the Rorschach, 
especially on oblique rotation. Barron 
(1953) operationally defined ego 
strength as treatment prognosis but the 
latter has also been demonstrated to be 
unrelated to response to color. Success- 
ful treatment of schizophrenic patients 
was found to be related to fewer CF and 
C responses by Stotsky (1952), with results
reaching statistical significance in one 
sample but not in the second. Filmer- 
Bennett (1952) reviewed nine studies 
which indicated an emphasis of color re- 
sponses in the Rorschach protocols of 
patients making best response to treat- 
ment, but then reported his own study, 
using more rigorous statistical proce- 
dures than were used in many of the 
earlier studies, which demonstrated that 
the Rorschach color scores were not sig- 
nificantly related to improvement under 
psychiatric treatment. 

Summarizing the above results, in- 
vestigations of the relationships between 
color responsiveness and impulsivity, as- 
saultiveness, and ego strength or ego 
control are equivocal; investigations of 
epileptics, psychopaths, and delinquents, 
commonly thought of as lacking in im- 
pulse control, are consistent only to the 
extent that these clinical groups are not 
significantly more responsive to color.


SUGGESTIBILITY AND PASSIVITY


Investigations of the relationships be- 
tween perceptual use of color and sug- 
gestibility also give indecisive results. 
Two available studies related Rorschach 
scores to social conformity as measured 
by shifts in judgment of autokinetic 
movement to conform to "judgments" 
of a confederate of the experimenter. 
Linton (1954) reported that subjects 
who increased their estimates as a result 
of the judgments of a confederate ob- 
tained significantly higher Sum C scores, 
while Steisel (1952) obtained similarly 
significant results with CF scores when 
the judgments of a confederate were 
directed at reducing the estimates made 
by the subject, but insignificant results 
when the judgments were directed at 
increasing the estimates made by the 
subject. Brennan (1943) reported that 
good subjects for hypnosis, highly sug- 
gestible individuals, have significantly 
higher Sum C scores. 

Females are typically more passive 
than males and would theoretically be 
expected to make greater use of color 
in perception. Hays (1952) found a 
significant tendency for females to have 
a higher incidence of color emphasis in 
the Experience Balance, but Felzer 
(1955) found scores on only one color 
measure, FC, significantly higher for 
females than males. Finally, no signifi- 
cant difference in color scores of males 
and females was found by Richards and 
Murray (1958). 

Although color responsiveness has not 
been shown to be related to suggestibil- 
ity in the form of tendency to be influ- 
enced by other individuals, there is 
evidence for its relationship to “sug- 
gestibility” in the form of tendency to 
be influenced by the nonpersonal en- 
vironment. The CF and C scores of the 
Rorschach were found by Wittenborn 
(1950) to have high loadings on a factor 
labeled perceptual control, involving the
degree to which people are spontaneous
and uncontrolled in their perceptual ap-
proach, a result supported by the factor
analysis of Rorschach and MMPI vari-
ables by Williams and Lawrence (1954).
Mann (1956) reported a significant posi- 
tive relation between Sum C scores and 
responsiveness to the immediate en- 
vironment as indicated by the number 
of words given in free association which 
referred to the immediate environment. 
Allred (1959) found significant rela- 
tionships between use of color in draw- 
ings and general responsivity and aware- 
ness as reflected by the number of 
Rorschach responses and by the number 
of content categories used. That the 
more active individual is more likely to 
respond to color has been demonstrated 
by the studies of Singer and Spohn 
(1954) and of Clark (1948). Singer 
and Spohn found a regular increase in 
Sum C scores in conjunction with in- 
creased activity of schizophrenic pa- 
tients unobtrusively observed in a wait- 
ing room. Clark’s item analysis of 
MMPI responses led him to conclude 
that the higher Sum C score is related to 
a mild hypomanic tendency. Other stud- 
ies reviewed below agree that the more 
depressed individual is less responsive 
to color on the Rorschach, though color 
responsiveness as measured by picture 
preferences was not related to depres- 
sion. 

The response to color on the Ror- 
schach is compounded by other factors. 
Baughman (1958) has reviewed studies 
of the color shock phenomenon which 
demonstrate that it is largely independ- 
ent of the actual color of the ink blots. 
Siipola (1950) has shown that disrupt- 
ing effects of color in ink blots occur 
when the color is inappropriate to the 
form of the blot. George (1955) found 
that color cards are preferred to non- 
color cards in the Rorschach, but data 
reported by Wallen (1948) suggest that
this is because the color cards occur last
in the standard Rorschach sequence. 
When the order of the cards was re- 
versed, the cards occurring last in the 
sequence (the noncolor cards) were pre- 
ferred. These findings have been con- 
vincingly interpreted by George and 
Bonney (1956) in the light of Helson’s 
adaptation level theory. Their thesis is 
that color influences responses in that it 
is a stimulus change. If color represents 
novel stimulus variation it is perceived 
as pleasant, but if it is completely un- 
expected it is disrupting or shocking. 
Response to color according to George 
and Bonney’s interpretation is diagnos- 
tic of the adaptation level of the subject. 

Summarizing the results in this area, 
studies do not show consistent relation- 
ships between Rorschach color scores 
and suggestibility and passivity, though 
an association is consistently reported 
between increased color responsiveness 
and nonpersonal environmental respon- 
siveness, as well as heightened activity 
in general. The color on Rorschach cards 
to a large degree appears to be simply a
stimulus change to which the subject 
can react efficiently or awkwardly. 


ABUNDANCE OF AFFECT OR EMOTION 


Various clinical groups—hysterics,
anxious and depressed neurotics, and
psychotics in general—commonly recog-
nized as emotionally disturbed have
been studied so that comparisons be-
tween these groups and normals in color
responsiveness will serve as tests of the
theoretical position that use of color is
indicative of greater emotionality.

The study of Rorschach records given 
by hysterics is limited to Fisher's report 
(1951a) which gave mean color scores 
of 20 females with diagnoses of con-
version hysteria. The scores were
not statistically compared with controls,
but the mean FC, CF, and Sum C scores
would fall at the thirty-sixth, thirty-
fifth, and twenty-eighth percentiles, re-
spectively, on the normative tables pub- 
lished by Cass and McReynolds (1951). 
In contrast, the mean number of pure C 
responses would fall at the eighty-sixth 
percentile. The reduced use of color in 
combination with other determinants, in 
contrast to the increased use of pure C,
would appear to be meaningful but needs 
further study. Comparable results were 
obtained by Williams and Lawrence 
(1954) through correlations between 
Rorschach and MMPI variables. The 
Hysteria scale of the MMPI and FC, 
CF, and C yielded tetrachoric correla- 
tions of —.35, .00, and .19, respectively. 
Clark (1948) found no significant dif- 
ference in Hysteria scores between 
groups scoring high and low on Sum C, 
though higher Sum C scores were sig- 
nificantly related to scores on the Hy- 
pochondriasis scale. The color measure 
developed from picture preferences by 
Cerbus and Nichols (1962) was corre- 
lated significantly with both Hypochon- 
driasis (.34) and Hysteria (.20) scales, 
but not with therapists’ ratings of patients’
use of hypochondriacal complaints. The 
significance of the correlations with the 
Hysteria and Hypochondriasis scales is 
questionable since there were only 2 sig- 
nificant correlations among 63 correla- 
tions calculated with the measure of 
preference for colored pictures. Three 
correlations among 63 would be expected 
to attain significance by chance. 
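
The chance expectation cited above can be checked with a short calculation. The sketch below assumes a .05 significance level and, for the probability figure only, independence among the 63 correlations; neither assumption is stated in the source, and the function names are illustrative rather than taken from any cited study.

```python
from math import comb

def expected_chance_hits(n_tests, alpha):
    """Expected number of 'significant' results when every null hypothesis is true."""
    return n_tests * alpha

def prob_at_least(k, n_tests, alpha):
    """Binomial probability of k or more chance hits (assumes independent tests)."""
    below_k = sum(
        comb(n_tests, i) * alpha**i * (1 - alpha) ** (n_tests - i)
        for i in range(k)
    )
    return 1.0 - below_k

N_TESTS = 63   # correlations computed with the picture-preference measure
ALPHA = 0.05   # assumed significance level (not stated in the source)

expected = expected_chance_hits(N_TESTS, ALPHA)
p_two_or_more = prob_at_least(2, N_TESTS, ALPHA)

print(f"Expected chance hits: {expected:.2f}")
print(f"P(2 or more hits by chance): {p_two_or_more:.2f}")
```

Under these assumptions roughly three significant correlations are expected by chance alone, and obtaining two or more is the likely outcome even when no true relationship exists, which is why the two observed correlations carry little weight. Because MMPI scales are themselves intercorrelated, the independence assumption is only approximate.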

The relationship of color responsive- 
ness to anxiety has been investigated by 
means of anxiety scales and by experi- 
mentally induced anxiety. Results are 
essentially negative with either type of 
study. Rorschach color scores were 
found to be insignificantly related to 
scores on the Sarason anxiety question- 
naire (Cox & Sarason, 1954), to scores 
on the Taylor Anxiety scale (Goldstein 
& Goldberger, 1955; Holtzman, Iscoe, 
& Calvin, 1954; Westrope, 1953), and to
scores on the Taylor scale administered
under experimental stress (Schwartz & 
Kates, 1957). Most of the studies used 
college students as subjects, but even 
with the wider range of anxiety assumed 
to be present in the patient population 
of the Goldstein and Goldberger study, 
results were negative. A significant posi- 
tive correlation was reported by Levy 
and Kurz (1957) between scores on the 
Taylor measure and a unique color 
measure from the Rorschach—the dif- 
ference in semantic differential ratings 
of chromatic and achromatic versions of 
the last three cards. This one signifi- 
cant report may be a chance finding, or 
it may be the result of the use of the 
unique color measure with anxious sub- 
jects, similar to the effect in the study 
by Matarazzo et al. (1952). Here anx- 
ious subjects did not use color to the 
same extent on the Rorschach and in a 
photic stimulation procedure, though 
nonanxious subjects obtained color 
scores in the two procedures which were 
significantly related. Finally, the pref-
erence for noncolor pictures was found
by Cerbus and Nichols (1962) to be in- 
significantly related to ratings of anxiety 
and to a measure of anxiety (the sum of 
the Hs, D, and Pt scores of the MMPI), 
the validity of which was demonstrated 
by Windle (1955). 

Turning to studies using anxiety in- 
duced by experimental stress, Eichler 
(1951), Westrope (1953), Baker and 
Harris (1949), Stopol (1954), and 
Allee (1948) reported insignificant 
results. Williams (1947) reported a 
positive finding, but Carlson and Laza-
rus (1953) replicated Williams' study
and found an insignificant trend in the 
opposite direction. 

The association of depression and re- 
duced responsiveness to color is gen- 
erally supported by research results us- 
ing the Rorschach. Costello (1958) 
reported that suicidal patients used
significantly fewer pure C responses
than nonsuicidal patients, and Fisher
(1951b) found that suicidals used
significantly fewer FC responses. Kobler
and Steil (1953) reviewed nine publica- 
tions which indicated that pure color 
responses are markedly reduced or 
absent in the protocols of depressed 
patients, though most of the studies re- 
viewed failed to use statistical tests. 
Kobler discussed Rorschach color scores 
of 45 females with diagnoses of involu- 
tional melancholia. Statistical compari- 
son with normals was not presented, but 
the means of all color scores were below 
the fortieth percentile on the norms 
presented by Cass and McReynolds 
(1951). Similarly, college subjects with 
high discrepancy scores between con-
ceptions of “self” and “ideal self” were
characterized by five of six commonly 
accepted signs of depression in their 
Rorschach protocols, reduced use of 
color by those with high unachieved 
goals being highly significant (Bills, 
1954). Wittenborn and Holzberg (1951) 
found that depressed patients used 
significantly fewer CF responses than
manic patients, though no comparisons 
were made with normals. Tetrachoric 
correlations between the Depression 
scale of the MMPI and the CF and C 
scores were —.32 and —.37 (Williams 
& Lawrence, 1954). Finally, Levine, 
Grassi, and Gerson (1943) reported that 
hypnotically induced depression in a 
subject resulted in marked decrease in 
color responses. Incompatible results 
were obtained with the measure of pref- 
erence for noncolor pictures by Cerbus 
and Nichols (1962). Color responsive- 
ness so measured was not significantly 
related to the Depression scale of the 
MMPI, nor to therapists’ ratings of de- 
pression or suicidal ideation when con- 
sidered individually or as the factor of 
depression mania developed by Lorr 
(1953).

Differences between normals and
neurotics in response to color have 
mainly involved the color shock phe- 
nomenon, but a number of studies, thor- 
oughly reviewed by Baughman (1958), 
have demonstrated that this phenomenon 
is independent of the color in the ink 
blots. Differences between neurotics
and normals were found by Brackbill 
(1951) when using chromatic versions 
of the Thematic Apperception Test 
cards, though the differences were not 
present when responding to the standard 
achromatic cards. Neurotics in his study 
told stories about the color cards that 
were rated significantly more depressed 
and more intellectual than the stories 
given by normals to the color cards, sug- 
gesting that the chromatic version is 
more effective in arousing unpleasant 
emotional associations and stimulating 
the expression of the subject’s prevailing 
mood. 

Most results indicate little association 
between schizophrenia and color re-
sponsiveness. Friedman (1952) found
no significant difference between nor- 
mal and schizophrenic groups in Ror- 
schach color scores. The Schizophrenia 
scale of the MMPI had tetrachoric cor- 
relations with FC, CF, C, and Sum C 
of .19, —.31, —.23, and .17, respectively, 
in samples of 100 subjects (Altus, 1948; 
Williams & Lawrence, 1954). Cerbus 
and Nichols (1962) found no significant 
relationship between the Schizophrenia 
scale of the MMPI and preferences for 
achromatic pictures. With designs of the 
Lowenfeld Mosaic Test, Levin (1956) 
reported no significant difference be- 
tween schizophrenics and normals in the 
"emphasis on form, with active rejec- 
tion of color." Only the results obtained 
by Keehn (1954) were incompatible 
with the above. On a measure derived
from factor analysis of color-form sort-
ing tasks, schizophrenics were signifi-
cantly more responsive to color than
either neurotics or normals, though there
was no difference between neurotics and
normals. This positive finding appears
to be the result of the different stimulus
materials for which Keehn (1953), as
noted above, has given evidence that
indicates it involves a different dimen-
sion than response to color in the Ror-
schach and other perceptual tasks.

Summarizing the results in this area,
evidence does not support an association
between color responsiveness and hys-
terical and anxiety reactions. Rorschach
measures of color responsiveness have
been demonstrated to be related to de-
pression, though color responsiveness as
measured by picture preference was not.
Neurotics and schizophrenics were not
significantly different from normals in
use of color on the Rorschach, though
significant differences have been ob-
tained with colored TAT cards and
color-form sorting tasks.


DISCUSSION


The insignificant results in some areas
and the indecisive results in others indi-
cate that widely accepted premises need
re-examination. In the area of impul-
sivity, the highest correlation indicat-
ing a relationship between ratings of
impulsivity and color responsiveness
was reported by Gardner (1951),
but since it was based on only 10 sub-
jects it would need cross-validation be-
fore being accepted. Available evidence . . .

. . . response to the change in stimulus from
achromatic to chromatic ink blots would
indicate an alert, active subject. The
strong evidence for reduced use of color
by depressed patients concurs with color
responsiveness as an indication of re-
sponse to the impersonal environment
since the depressed patient is withdrawn
and is uninterested in the external en-
vironment. Increased use of color has
not been found with hysterical or anx-
ious individuals, nor with groups of
neurotics or schizophrenics, disclaiming
any association between color response
and affect.

These findings are not encouraging for
those who would use response to color in
individual clinical diagnosis. The usual
Rorschach color measures suffer from
comparatively low reliability—the coef-
ficients ranging from .29 to .66 obtained
by Clark (1948) are typical—which
detracts from their potential validity in
personality assessment. Although the
attempt by Cerbus and Nichols (1962)
to use other stimuli with improved reli-
ability to measure color responsiveness
indicated no association with personality
variables, personality measures might be
developed from stimuli such as the color-
form sorting tasks, or from adjustments
of the usual Rorschach color measures.


REFERENCES


Allee, R. Rorschach responses of extreme
deviates in an experimental stress condition.